INTRODUCTION


Definition 


The routine which accepts a set of symbolically coded instructions,
translates them into machine language, and at the same time
converts symbolic or regional addresses to absolute machine
addresses is called an assembly routine. It may or may not allow
for macro instructions. The assembler helps with the symbology
problem confronting the programmer, expands the number of
apparent operations available from the machine, and eases the chore
of assigning storage locations.


An extension of the assembly technique is the compiling routine. A
compiler permits more complex macro instructions than an
assembler, and often excludes machine instructions, even in
symbolic form, from the language which it can accept. While the
assembler generally deals with each instruction independently of all
others, the compiler attempts to capitalize on the information which
is contained in the structure or logic of the problem. The context
in which each instruction is nested is important. Commonly, a
compiler is language or problem-oriented in that it accepts as input
the language and operations of a particular class of problems.


List Processing Languages. 


The allocation of storage space, even with sophisticated automatic
coding languages, can be a major difficulty, because many times it is
impossible to foresee how much information will be produced or have
to be stored at each phase of the routine. The concept of list stores
and list processing languages was developed in order to surmount
such difficulties. "List" is used in the conventional sense to
designate a linear sequence of pieces of information which for some
reason are to be associated. The length of a list is not necessarily
fixed. This implies a capability for inserting or deleting an entry
anywhere along the list. Such a list is frequently called pushdown
if the only entry point is at the top. An entry on a list may be the
name of a sub-list. The sub-list in turn may reference another
sub-list. Such a collection of lists and sub-lists is called a list
structure.


List processing languages permit such operations as 1) Insert an
entry on a list, 2) Delete an entry on a list, 3) Create a list,
4) Destroy a list, 5) Coalesce or concatenate a list, and 6) Search
a list for a given symbol. They are sometimes called symbol
manipulating languages, and include LISP and the family of
Information Processing Languages, IPL-V being the best known.
Such languages facilitate storage allocation and give complete
freedom for development and use of recursive subroutines.


All of the List Processing Languages employ pushdown stores or 
stacks to hold the return address and the parameters used by a 
subroutine. If, during the execution of a first subroutine, a second 
subroutine is entered, the return address and parameters of the 
second are simply pushed down on those of the first. In this manner, 
the latter are preserved and pop up upon completion of the second 
routine and return to the first. When the stack has sufficient 
capacity, subroutines can be nested to any depth and used recursively. 
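
The push and pop behavior described above can be sketched in Python.
This is only an illustration under stated assumptions: the function
name factorial_with_stack is hypothetical, and the stack entries hold
only the parameter of each suspended call, since a return address has
no direct counterpart in Python.

```python
def factorial_with_stack(n):
    """Compute n! without native recursion, using an explicit pushdown store."""
    stack = []  # each entry is the parameter of a suspended "call"

    # Entry phase: every nested call pushes its parameter on top of the
    # parameters of the calls that are still waiting.
    while n > 1:
        stack.append(n)
        n -= 1

    result = 1
    # Return phase: the parameters pop up in reverse order of entry,
    # so the most recently entered call finishes first.
    while stack:
        result *= stack.pop()
    return result


print(factorial_with_stack(5))
```

The depth of nesting is limited only by the capacity of the list used
as the stack, which mirrors the remark about stack capacity above.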


Compiler Concepts


One of the basic concepts employed when writing a compiler is that 
information which is phased for further use when translating can be 
conveniently kept in a stack. A stack is distinguished from other 
types of tables by the fact that only the item at the top of the stack, 
the youngest item, is important at any given time. A second concept 
implemented is that comparison of adjacent operator properties 
provides a valuable criterion, and no more than this is needed to 
correctly interpret any formula. The third technique is that 
parentheses can be treated as operators with priorities, thus 
enhancing the algorithm. 


Roughly, a compiler will scan a long expression, left to right or 
right to left, until some operation is found which can be performed 
regardless of what will occur in the remainder of the expression. 
This operation is discovered through a force table or force codes. 
As soon as an operation is found which can be performed regardless 
of future input, it is accomplished. Meanwhile, the unused portion
of the formula is retained in the stack. The compiler must now be
able to switch back and forth to discover which operators can be
forced and what is available within the stack.


Compiler Construction 


Compilers allow the programmer to write the problem solution in
broad source statements, i.e., macro instructions. These
statements are analyzed by the compiler, which in turn generates
the necessary symbolic instructions. The segment of a compiler
which interprets a macro instruction and develops the required
symbolic instruction sequence is called the macro generator. For
each macro instruction included in a given compiler language, there
is a separate macro generator. The entire complex of macro
generators provided in a compiler is in only one section of that
compiler. The generators are used in a specific phase of the entire
translation process. From this point the final portion of a compiler
performs the same functions as any other assembly program. Every
compiler includes an assembly process to effect final conversion to
machine language.
II.


XTRAN 


A. INTRODUCTION 


History 


As the complexity and quantity of machine applications have increased, 
problem oriented programming languages have become more 
popular. FORTRAN and ALGOL for scientific formula oriented 
programs, and COBOL for business type problems are being used 
more widely than ever before. It is only natural then that attempts
have been made toward a problem oriented approach to compiler 
writing, especially when we consider the magnitude of the job of 
efficiently writing a compiler. XTRAN is a language that was 
developed by the IBM Corporation for use by the Programming 
Systems Department. The XTRAN language exists in a slightly 
different form for different machines. Therefore, the following
discussion will not deal with an official version of the language,
but instead with the general concepts and techniques used.


String Techniques 


In FORTRAN, the basic element dealt with is the formula. In 
XTRAN we are concerned with the sequence of the characters called 
the string, which must be broken down and analyzed to such an 
extent that linkages to closed subroutines can be generated by a 
subsequent program called the macro expander program. For the 
7090 the output of XTRAN must be such that the macro expander will 
be able to generate a TSX followed by an appropriate number of PZE
instructions which would cause the desired subroutine linkages to
be formed.


What types of operations can we expect XTRAN to perform on strings?
One useful bit of information about a string during analysis would be
its length or number of symbols, called the NORM. Consequently,
one operation found in XTRAN is NORM (ST, I), which would compute
the NORM of the string named ST and store the result in location I.
Another operation is used to isolate the end symbol of the string in
order to analyze it. Further operations remove a symbol from the
string, insert a symbol in a string, join two strings together, or find
the first occurrence of a symbol in a string.


Compilation 


One of the most important aspects of XTRAN is that it is a language 
that can be used to compile itself. The approach taken can be to 
write a few basic XTRAN operations in the machine symbolic 
language and after these are assembled and debugged write more 
powerful operations in XTRAN. The language will generate a set
of symbolic instructions which can be reassembled with the prior
version of the language and included in the new language. XTRAN
has the ability to scan source statements and compile itself in
FORTRAN on, say, the 7040; then, if someone wants a FORTRAN
compiler for the 1440, XTRAN does not have to be rewritten for it.
Only the macro expander for the 1440, which takes the output of
XTRAN, must be written in order to generate the desired linkages.
What we have in XTRAN then is a truly general purpose compiler
writing program.


High- and Low-Level Languages


Before commencing with a definition of XTRAN some terminology
should be clarified. When considering high- and low-level languages,
the low-level language is close to the machine language and would
be similar to 1401 or 1440 SPS. A high level (or higher level)
language is more removed from machine language and would
encompass such languages as COBOL, FORTRAN, and ALGOL.


Functions and Procedures 


Consider the mathematical computation: XNEW = 1/2 (A/XOLD +
XOLD). This is the appropriate mathematical format for expressing
the function XNEW. The proper format for the function XNEW as
stated in the FORTRAN language would be XNEW = 1./2.*(A/XOLD +
XOLD). A language which has only procedures will express this
same function as TEMP 1 = A/XOLD; TEMP 2 = TEMP 1 + XOLD;
XNEW = TEMP 2/2. In the latter procedural language, as soon as
the value was produced, a place had to be generated for it to be
stored.


Functions are rules for producing a particular value. 1/2 (A/XOLD +
XOLD) was the function for XNEW expressed in the correct
mathematical notation. A procedure differs from a function in that
no value is produced. A typical example of a procedure would be R=S.
In FORTRAN the value of S is placed in a location that is labeled R.
No new value has been produced and the old value of S has not changed.


Prefix, Infix, and Suffix Notation


All of the notations that we have used to this point, have been of the 
type infix. Infix notation means that the operator is included between 
or inside the two operands. Infix notation would include the value 
A+B. Prefix notation has its operator preceding the two values, or 
operands. An example of addition of two values expressed in prefix 
notation would be + AB. The third type of notation that is used is 
suffix. The suffix notation has the operator following the two 
operands. An example would be AB +.


Further examples of infix notation would be TEMP 1 = A/XOLD and
TEMP 2 = TEMP 1 + XOLD. Prefix notation is similar to a three
address machine language. The earlier two infix statements would
be expressed in the following manner: /A, XOLD, TEMP 1 states
that A should be divided by XOLD and placed in TEMP storage
location 1; and + TEMP 1, XOLD, TEMP 2 states that TEMP
storage 1 should be added to XOLD and the result placed in TEMP
storage location 2.


Language Ambiguity 


For procedures, there is little difference between prefix and infix
notation; however, this is not true of functions. Since infix is not
completely specified, it requires rules of precedence. Prefix
notation is completely specified, and this is called Polish or
parenthesis-free notation.


Functional notation is found many times in the form of a prefix, for
instance: F(A,B); G(X,Y,Z); or R(S,T(M,N),T). With functional
notation operators may not be binary and they do not refer necessarily
to functions or procedures.


Polish Notation 


Let us consider a further example in prefix notation, the math
function: Y=S/T(A/X+X). A mathematician working with this
formula would know what rules of precedence are needed in order
to compute the correct result. However, the computer must be
given additional information. Therefore, Polish notation or
parenthesis-free notation is used. The one rule in using Polish
notation is that the operator always has its two addresses immediately
to the right. The preceding math statement would be revised to:
Y=*/ST+/AXX. Since the operator always has its two addresses
immediately to the right, the first operation in the scan from left to
right to be performed would be /ST. Here S would be divided by T
and a temporary value would be placed in this area, TEMP 1.
Scanning again from left to right, the next group of values that has
an operator with two addresses to the right is /AX; therefore, A
divided by X will be the next computation performed, with the result
placed in TEMP 2. After /AX has been accomplished, two addresses
now follow the +, i.e., +TEMP 2 X. The third operation would be to
add TEMP 2 to X, leaving the sum in TEMP 2. Two addresses, TEMP 1
and TEMP 2, now follow the original asterisk. TEMP 1 and TEMP 2 are
multiplied in order to compute the value Y.
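
The repeated left to right scan can be sketched in Python. This is a
minimal illustration, assuming single-character operand names and the
four binary operators + - * /; the function name reduce_polish is
hypothetical. Unlike the walk-through above, which reuses TEMP 2 for
the sum, the sketch allocates a fresh temporary for every operation.

```python
OPERATORS = set("+-*/")


def reduce_polish(tokens):
    """Repeatedly scan left to right for an operator followed directly by
    two operands, and replace that group of three with a new temporary."""
    tokens = list(tokens)
    steps = []
    while len(tokens) > 1:
        for i in range(len(tokens) - 2):
            op, a, b = tokens[i:i + 3]
            if op in OPERATORS and a not in OPERATORS and b not in OPERATORS:
                temp = f"T{len(steps) + 1}"
                steps.append((op, a, b, temp))
                tokens[i:i + 3] = [temp]   # the result takes the group's place
                break
        else:
            # No operator with two operands to its right was found.
            raise ValueError("not a well-formed Polish expression")
    return steps


for step in reduce_polish("*/ST+/AXX"):
    print(step)
```

Each tuple reads as operator, first address, second address, result,
which is the three address form discussed earlier.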


B. PUSHDOWN STACKS 


Consider the function (A+B) in infix notation and (+AB) in Polish
notation. Within infix notation equally simple examples can be
given, such as (A+B*C), which in Polish notation would be expressed
as (+A*BC). Again, in infix (Y=Z+R) would be expressed in
Polish as (=Y+ZR). It becomes more complicated when taking the
example: ((B-C)*A)*(X-Y↑T). This would have to be expressed
in Polish notation as **-BCA-X↑YT. The scan may be performed
from right to left. (See Figure 1). When the operator is seen, it is
combined with the two names on the right, producing a new name
which is the name of the result. Or the scan can be made from left
to right, where after finding two names without an intervening OP
Code the combination would be made with the preceding OP Code.


In implementing a right to left scan of the example **-BCA-X↑YT
with one pushdown and output on every operator, we would have
TEMP 1 containing the calculation of Y to the power T; TEMP 2
containing the value X - TEMP 1; TEMP 3 containing the value
B - C; TEMP 4 containing TEMP 3 multiplied by A; and TEMP 5
containing the product of TEMP 4 (which contains the results of
B - C multiplied by A) and of TEMP 2 (which contains the results of
X - Y to the T power).
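
A right to left scan with one pushdown can be sketched in Python as
follows. This is an illustration under stated assumptions: operand
names are single characters, the character ^ stands in for the upward
arrow of exponentiation, and the function name scan_right_to_left is
hypothetical.

```python
OPERATORS = set("+-*/^")   # "^" stands in for the exponentiation arrow


def scan_right_to_left(polish):
    """Scan a Polish string from right to left with one pushdown stack,
    producing one three-address output line per operator."""
    stack, output = [], []
    for symbol in reversed(polish):
        if symbol in OPERATORS:
            left = stack.pop()     # top of stack is the left operand
            right = stack.pop()    # next entry is the right operand
            temp = f"T{len(output) + 1}"
            output.append(f"{symbol}{left},{right},{temp}")
            stack.append(temp)     # the result becomes an operand
        else:
            stack.append(symbol)
    return output


for line in scan_right_to_left("**-BCA-X^YT"):
    print(line)
```

For the example expression the five output lines correspond to
TEMP 1 through TEMP 5 in the order given in the text.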


Figure 1


((B-C)*A)*(X-Y↑T)  or  **-BCA-X↑YT


Stack (top first)      Output

Y  T                   ↑Y,T,T1
X  T1                  -X,T1,T2
B  C  A  T2            -B,C,T3
T3  A  T2              *T3,A,T4
T4  T2                 *T4,T2,T5
T5
If the implementation was from left to right, TEMP 1 would include
the result of B-C; TEMP 2, the product of TEMP 1 and the value A;
TEMP 3, the value of Y to the T power; and TEMP 4 would contain
the value X-TEMP 3. (See Figure 2). In the first phase of the
operation, reading from bottom to top of the list, **- would have
been pulled into our operator list. In the corresponding operand list,
reading from bottom to top, appear (((, which are the separators for
the operators, followed by B and C. Once two operands are
adjacent, the last operator in the list, which was a minus, can be
performed, thus causing the result of B-C to be stored in TEMP 1.


Phase 2 has the operator stack containing ** with the operand stack
containing (( TEMP 1 and A. Once again, two operands have been
entered into the operand stack without separators. Therefore, the
last operator in the operator stack is associated with the last two
operands and the calculation of multiplying A by TEMP 1 is performed.
TEMP 2 is created and put into the operand stack.


In Phase 3, reading from bottom to top, the operator stack contains
*-↑, and the operand stack has (T2(X(YT. Since two operands are
not separated, the operation of Y raised to the T power can be
performed.


The operator stack has in the fourth phase an *- with the operand 
stack of (T2(XT3. X-T3 is performed to create T4. 


Only the * remains in the operator stack in phase 5, with (T2 followed
by T4 in the operand stack. T2 can then be multiplied by T4 and
in the final phase, the operator stack would be empty and the
operand stack holding T5.
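
The phases above can be sketched in Python with two lists acting as
the operator and operand pushdowns. This is an illustration under
stated assumptions: operand names are single characters, ^ stands in
for the exponentiation arrow, the function name scan_left_to_right is
hypothetical, and each output line names the left operand first.

```python
OPERATORS = set("+-*/^")   # "^" stands in for the exponentiation arrow


def scan_left_to_right(polish):
    """Scan a Polish string left to right with an operator stack and an
    operand stack; "(" is pushed on the operand stack as a separator."""
    operators, operands, output = [], [], []
    for symbol in polish:
        if symbol in OPERATORS:
            operators.append(symbol)
            operands.append("(")       # one separator per pending operator
            continue
        operands.append(symbol)
        # Two adjacent operands with no separator between them mean the
        # top operator can be performed; the result may complete another.
        while len(operands) >= 3 and "(" not in operands[-2:]:
            right = operands.pop()
            left = operands.pop()
            operands.pop()             # discard the operator's separator
            op = operators.pop()
            temp = f"T{len(output) + 1}"
            output.append(f"{op}{left},{right},{temp}")
            operands.append(temp)
    return output


for line in scan_left_to_right("**-BCA-X^YT"):
    print(line)
```

The order of the output lines follows the five phases: B-C, the
product with A, Y to the T power, the subtraction from X, and the
final product.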


Figure 2


**-BCA-X↑YT


Phase    Operator stack       Operand stack
         (bottom to top)      (bottom to top)

1        * * -                ( ( ( B C
2        * *                  ( ( T1 A
3        * - ↑                ( T2 ( X ( Y T
4        * -                  ( T2 ( X T3
5        *                    ( T2 T4
6                             T5
Right to Left and Left to Right Scans


It should be noted at this point that in the left to right scan of
Figure 2 the number of ('s in the stack is equal to the order
of the operator. In the right to left scan the order of the operators
is kept with the operators in the operator stack. When an
operand is added, a check must be made to see whether or not there
are enough to satisfy the top operator.
C. STRING AND SYMBOL MANIPULATION


Definition


By the term symbol we mean a character, an atom, a basic element,
such as A, +, or $. A string is an ordered sequence of symbols
such as A+B*X-(Y+Z). String literals tend to be notative, such as
F(ABC) or S(3, ABC) or JOE S (3, ABC).


REMNS


One requirement of any string manipulative program would be
remove and insert operations. Let us take the following example:


A*B-R/S
T1 - R/S
T1 - T2
T3


In order to remove and insert operations it is convenient to express 
this with one statement in a compiler language. In XTRAN the 
statement is in the following format: REMNS(STA,N,SYA). 


REMNS represents the phrase REMove N Symbol; STA, the name of
the STring; N, the position in the string; and SYA where it is to be
stored after removal. This is termed functional notation.


A string is expressed in XTRAN with format: S(N,S1,S2,S3...SK*). 
The first N symbols are unconditionally a part of the string, and
symbols continue to be a part of it until the next ) is met. As an
example, take the string S(3,RX)ABCDE). This string includes the N
symbols, here 3, being R, X and ), and, according to the definition,
the values through E. This string is terminated by the right
parenthesis following the letter E.


Strings are assembled and held in main memory through dynamic 
storage allocation. (See Figure 3). Main memory might contain, 
as an example, two words JOE and FREE, with an additional area 
called the string area. The string header is JOE. In the address 
portion of JOE is the location of the first piece of string which is 
2,000. FREE has all locations that are not being used by an active 
string. 


The string words are also divided into two portions. At location
2000 exists a value X with an address of the next value of the
string, which is 1000. At location 1000 the string value Y has its
pointer at 3000, and at 3000 value Z has a pointer of zero. The
string might be expressed as S(0, XYZ).
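
The chained storage just described can be modeled in Python. This is
a toy sketch under stated assumptions: the class name StringStore and
its method names are hypothetical, the FREE area is kept as a plain
Python list rather than as a chain of words, and location 0 is
reserved to play the role of the zero pointer.

```python
class StringStore:
    """Toy model of a string area whose words are (symbol, pointer) pairs."""

    def __init__(self, locations):
        self.cells = {}               # location -> [symbol, next location]
        self.free = list(locations)   # FREE: locations not used by any string
        self.headers = {}             # string name -> location of first word

    def define(self, name, symbols):
        """Build a chained string from free locations, in the order given."""
        first = previous = 0          # 0 marks "no location"
        for symbol in symbols:
            location = self.free.pop(0)
            self.cells[location] = [symbol, 0]
            if previous:
                self.cells[previous][1] = location   # link the prior word
            else:
                first = location
            previous = location
        self.headers[name] = first

    def symbols(self, name):
        """Follow the pointers from the header and collect the symbols."""
        out, location = [], self.headers[name]
        while location:
            symbol, location = self.cells[location]
            out.append(symbol)
        return "".join(out)

    def release(self, name):
        """Return every word of the string to the free area."""
        location = self.headers.pop(name)
        while location:
            _, following = self.cells.pop(location)
            self.free.append(location)
            location = following


store = StringStore([2000, 1000, 3000])
store.define("JOE", "XYZ")
print(store.headers["JOE"], store.cells, store.symbols("JOE"))
```

With the free locations handed out in the order 2000, 1000, 3000, the
model reproduces the layout in the text: X at 2000 pointing to 1000,
Y at 1000 pointing to 3000, and Z at 3000 with a pointer of zero.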


NORM 


An additional value that would be required for string manipulative
procedures would be the NORM of a string, which is the number of
symbols in the string. A NORM word is always present in any
string definition. The NORM word is divided into two sections.
The right half is the pointer to the end of the string, and the left
half contains the NORM value. The address of the NORM word of the
string is in the string header.


Figure 3


Memory Map


Main Memory

JOE     address portion: 2000
FREE    locations not being used by an active string


String Area

Location    Value    Pointer

2000        X        1000
1000        Y        3000
3000        Z        0




Unary and Binary Operators 


Only binary operators, i.e., operators with two operands, have been
discussed. However, in many calculations, unary operators, which
take one operand, are also present. An example would be: A*(-B).
Here two operations are required for B. First of all B must be set to
a negative value, and secondly this negative value of B must be
multiplied with A.


Consider another example: (A*(-B))*(-(X-Y)). Expressed in Polish
notation, the formula would be: **A-B--XY. The first level of
operations would be to find the value X-Y. Within the same
operational level, B must be set to a negative value. The second
level of operations would multiply A times the negative value of B
and set the value of X-Y to its negative form. The third operation
would be to multiply the first value by the second value, i.e.,
multiplying A*(-B) and the negative value of X-Y.


Using parentheses, the formula would be expressed as:
*(*(A-(B))-(-(XY))). Or, in a slightly different manner:
*(*(A, -(B)), -(-(X, Y))).


A third way of expressing the same equation would be to substitute 
the letter F for the *, G for the binary minus, and H for the unary 
minus. Completing the substitution we would have the formula: 
F(F(A, H(B)), H(G(X, Y))). 


Functions and Procedures 


Symbol manipulative procedures or routines may be divided into
two types, those of functions and procedures. As defined earlier,
functions are rules for producing a particular value, and a
procedure differs from a function in that no value is produced.
(See Figure 4). Functions in the symbol manipulation routines are
concatenate, norm, get n-th symbol, first occurrence of symbol,
and strings identical. Procedures are free string, remove n-th
symbol, insert, add symbol to list, push, replace n-th symbol,
replace symbol by symbol, string assign, set pointer, and sequence.
Figure 4


Symbol Manipulation Routines


The following prefixes will distinguish the type of value which a name represents:


ST        string name
SY        symbol name
I or N    integer name
L         statement name (label)


FUNCTIONS


Representation      Name                          Type of Value Produced

CONCAT (STA,STB)    Concatenate                   String
NORM (ST)           Norm                          Integer
GETNS (ST,N)        Get n-th symbol               Symbol
FOS (ST,SY)         First occurrence of symbol    Integer
STID (STA,STB)      Strings identical             Boolean


CONCAT (STA,STB)
   Description: Produces a string which is STB concatenated to the
   end of STA.
   Example: JOE = S(0, AB). SAM = S(0, XYZ). CONCAT(JOE,SAM)
   produces the string S(0, ABXYZ).

NORM (ST)
   Description: Produces an integer which is equal to the number of
   symbols in ST.
   Example: ANN = S(0, RST). NORM(ANN) produces the integer 3.

GETNS (ST,N)
   Description: Produces a symbol which is the n-th symbol of ST.
   Example: BOB = S(0, MNXY). GETNS(BOB,TWO) produces the symbol N.

FOS (ST,SY)
   Description: If SY is in ST, the value produced is an integer equal
   to the position which the first SY occupies. Otherwise the value
   is zero.
   Example: JOE = S(0, RMQZMY). SYM is a location containing the
   symbol M. FOS(JOE,SYM) produces the integer 2.

STID (STA,STB)
   Description: If STA and STB are identical a true value is produced,
   otherwise a false value.


PROCEDURES


Call                  Name

FREE (ST)             Free string
REMNS (ST,N,SY)       Remove n-th symbol
INSRT (ST,I,SY)       Insert
ADDSL (ST,SY)         Add symbol to list
PUSH (ST,SY)          Push
REPNS (ST,N,SY)       Replace n-th symbol
REPLS (ST,SYA,SYB)    Replace symbol by symbol
STASN (STA,STB)       String assign
POINT (ST,I,PT)       Set pointer
SEQ (PT,SY,L)         Sequence


FREE (ST)
   Description: Free ST.

REMNS (ST,N,SY)
   Description: Set SY equal to the n-th symbol of ST. Remove the
   n-th symbol of ST.
   Example: BOB = S(0, MNXY). REMNS(BOB,TWO,SYA) will cause the
   symbol N to be stored in SYA and will change BOB to S(0, MXY).

INSRT (ST,I,SY)
   Description: Insert SY in ST between the i-th and (i+1)th symbols,
   so that SY becomes the (i+1)st symbol of ST.
   Example: ANN = S(0, RSTW). SYL contains the symbol L.
   INSRT(ANN,THREE,SYL) will change ANN to S(0, RSTLW).

ADDSL (ST,SY)
   Description: Add SY to the end of ST.
   Example: SAM = S(0, XYZ). SYA contains the symbol A.
   ADDSL(SAM,SYA) will change SAM to S(0, XYZA).

PUSH (ST,SY)
   Description: Add SY to the beginning of ST.
   Example: SAM = S(0, XYZ). SYA contains the symbol A.
   PUSH(SAM,SYA) will change SAM to S(0, AXYZ).

REPNS (ST,N,SY)
   Description: Replace the n-th symbol of ST by SY.
   Example: JOE = S(0, BXRS). SYQ contains the symbol Q.
   REPNS(JOE,TWO,SYQ) will change JOE to S(0, BQRS).

REPLS (ST,SYA,SYB)
   Description: Wherever SYA occurs in ST, replace it by SYB.
   Example: BOB = S(0, RMRSTRQ). SYR contains the symbol R. SYM
   contains the symbol M. REPLS(BOB,SYR,SYM) will change BOB to
   S(0, MMMSTMQ).

STASN (STA,STB)
   Description: Set STB equal to STA.
   Example: JOE = S(0, XYZ). SAM = S(0, AB). STASN(JOE,SAM) will
   change SAM to S(0, XYZ) and leave JOE unchanged.

POINT (ST,I,PT)
   Description: Set pointer PT to point to the i-th symbol of ST.

SEQ (PT,SY,L)
   Description: Set SY equal to the symbol that PT is pointing at.
   Advance PT to the next symbol in the string. If PT was past the
   last symbol in the string, transfer to L.
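
Several of the routines in Figure 4 can be imitated in Python to check
the examples. This is a sketch under stated assumptions, not the
XTRAN implementation: a string is modeled as a Python list of
one-character symbols, the lowercase function names are hypothetical,
and remns returns the removed symbol instead of storing it in a named
location.

```python
# Functions: each produces a value and leaves its arguments unchanged.
def concat(sta, stb):
    return sta + stb                       # STB joined to the end of STA

def norm(st):
    return len(st)                         # number of symbols

def getns(st, n):
    return st[n - 1]                       # positions are counted from 1

def fos(st, sy):
    return st.index(sy) + 1 if sy in st else 0

# Procedures: each changes the string in place.
def remns(st, n):
    return st.pop(n - 1)                   # remove and hand back n-th symbol

def insrt(st, i, sy):
    st.insert(i, sy)                       # sy becomes the (i+1)st symbol

def addsl(st, sy):
    st.append(sy)                          # add at the end

def push(st, sy):
    st.insert(0, sy)                       # add at the beginning

def repls(st, sya, syb):
    st[:] = [syb if s == sya else s for s in st]


bob = list("MNXY")
removed = remns(bob, 2)
print(removed, "".join(bob))
```

Python lists already allow insertion and deletion anywhere, so the
sketch shows only the behavior of each routine, not the pointer
handling that dynamic storage allocation would require.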
D. LANGUAGE ONE


Introduction 


A very simple language such as an SPS assembly, where there are
no macros, is certainly efficient and straightforward. By having
only procedures available a one-for-one assembly can be assumed.
For instance, to express the function NORM(ST) in Language One
would require an OP Code of FUNCL with an address of NORM,
followed by an OP Code of PAR with an address of ST, and an
additional OP Code PAR followed by the address N, as in Figure 5.


Figure 5

Operation    Address

FUNCL        NORM
PAR          ST
PAR          N


If we wish to express the function REPlace N Symbol, the OP Code
of FUNCL would appear with an address of REPNS, followed by three
PAR OP Codes with the addresses being respectively ST,
N, and SY. (See Figure 6).


Figure 6

Operation    Address

FUNCL        REPNS
PAR          ST
PAR          N
PAR          SY
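
The one-for-one pattern of Figures 5 and 6 can be sketched in Python.
The function name language_one_lines is hypothetical and the result
is only a list of (operation, address) pairs, not assembler input for
any particular machine.

```python
def language_one_lines(name, *parameters):
    """Return one FUNCL line for the routine, then one PAR line for
    each parameter, as (operation, address) pairs."""
    lines = [("FUNCL", name)]
    lines += [("PAR", parameter) for parameter in parameters]
    return lines


for operation, address in language_one_lines("REPNS", "ST", "N", "SY"):
    print(f"{operation:<8}{address}")
```

Because every call expands to a fixed pattern, a macro expander for a
given machine only has to map FUNCL and PAR onto that machine's
subroutine linkage.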
Programs in Language One


Examples of programs written in Language One, the scan, and a
description of the bootstrap procedure appear in Figures 7, 8 and 9.
The symbols mean the following: a colon ends a label; a comma ends
a parameter; ) ends a parameter; ( ends a procedure name; and a
semi-colon is noise. The comma and ) are interchangeable, and
several labels are possible.


Figure 7 


Examples of Programs in Language 1 


FOS (ST, SY, I)

NORM (ST, IA); ASSIGN (ONE, I); LC: GETNS (ST, I, SYA);
EQUAL (SYA, SY, B); CONDTRA (B, LA); EQUAL (IA, I, B);
CONDTRA (B, LB); ADD (ONE, I, I); GO TO (LC);
LB: ASSIGN (ZERO, I); LA: RETURN (DUMMY);


SQUARE ROOT

ASSIGN (A, XOLD); LB: DIV (A, XOLD, TA); ADD (TA, XOLD, TB);
DIV (TB, TWO, XNEW); SUB (XOLD, XNEW, TC); ABS (TC, TD);
LESS (TD, EPSILON, TE); CONDTRA (TE, LA); ASSIGN (XNEW, XOLD);
GO TO (LB); LA: STOP (IDENT);
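
The two programs of Figure 7 can be transliterated into Python to
check their logic. This is a sketch under stated assumptions: the
labels LA, LB and LC become loop exits, the function names and the
default value of epsilon are hypothetical, fos expects a non-empty
string, and square_root expects a positive argument.

```python
def fos(st, sy):
    ia = len(st)                  # NORM (ST, IA)
    i = 1                         # ASSIGN (ONE, I)
    while True:                   # LC:
        sya = st[i - 1]           # GETNS (ST, I, SYA)
        if sya == sy:             # EQUAL (SYA, SY, B); CONDTRA (B, LA)
            return i              # LA: RETURN
        if ia == i:               # EQUAL (IA, I, B); CONDTRA (B, LB)
            return 0              # LB: ASSIGN (ZERO, I)
        i = i + 1                 # ADD (ONE, I, I); GO TO (LC)


def square_root(a, epsilon=1e-12):
    xold = a                      # ASSIGN (A, XOLD)
    while True:                   # LB:
        ta = a / xold             # DIV (A, XOLD, TA)
        tb = ta + xold            # ADD (TA, XOLD, TB)
        xnew = tb / 2             # DIV (TB, TWO, XNEW)
        tc = xold - xnew          # SUB (XOLD, XNEW, TC)
        td = abs(tc)              # ABS (TC, TD)
        if td < epsilon:          # LESS (TD, EPSILON, TE); CONDTRA (TE, LA)
            return xnew           # LA: STOP
        xold = xnew               # ASSIGN (XNEW, XOLD); GO TO (LB)


print(fos("RMQZMY", "M"), square_root(2.0))
```

Every Python statement corresponds to one Language One call, which is
the one-for-one property claimed for a language with only procedures.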
Figure 8


Scan for Language 1


START:  FREE (NAMST);
        INPUT (SRCST, LEND);
LB:     NORM (SRCST, TA);
        TRAEQ (TA, ZERO, LA);
        REMNS (SRCST, ONE, CURSY);
        TRAEQ (CURSY, BLKSY, LB);
        TRAEQ (CURSY, SCLSY, LB);
        TRAEQ (CURSY, COLSY, COLSR);
        TRAEQ (CURSY, LPRSY, LPRSR);
        TRAEQ (CURSY, COMSY, COMSR);
        TRAEQ (CURSY, RPRSY, COMSR);
        ADDSL (NAMST, CURSY);
        GO TO (LB);
COLSR:  CONCAT (COLST, NAMST, NAMST); GO TO (LC);
LPRSR:  CONCAT (LINKST, NAMST, NAMST); GO TO (LC);
COMSR:  CONCAT (COMST, NAMST, NAMST);
LC:     OUTPUT (NAMST);
        FREE (NAMST);
        GO TO (LB);
LEND:   STOP (IDENT);
Where:

SRCST = Source string
LEND  = Label, end of file routine
CURSY = Current symbol
BLKSY = Blank symbol
SCLSY = Semi-colon
COLSY = Colon
LPRSY = Left parenthesis
RPRSY = Right parenthesis


Figure 9 


Bootstrap Procedure 


1. Write "basic" subroutines in machine language.


2. Create macros which are calling sequences to 
these subroutines. 


3. Additional subroutines can now be written using 
these macros. As each subroutine is written, 
a macro can be created for it. 


4. Scan can now be written with macros, one per 
card. Assembly deck must include the macro 
definitions, and the subroutines. Output of this 
assembly will be the compiler in machine 
language. 


At this point the following problem could be worked: write replace 
symbol REPLS (ST, SYA, SYB) using the functions of NORM, 
REMNS, ADDSL, PUSH, FREE, CONCAT, POINT, SEQ, and 
INSRT. (See Figure 9A). 
Figure 9A


REPLS (ST,SYA,SYB)


LA    SET      STR=ST
      POINT    (ST, (FOS(STR,SYA)), PT)     SET POINT TO FIRST OCCUR
      SEQ      (PT, SY, LB)
      INSRT    (ST, (FOS(STR,SYA)), SYB)    INSERT B
      REMNS    (ST, (FOS(STR,SYA)), SYA)    REMOVE A
      SET      STR=PT
      TRA      LA
LB    END


Balancing Parentheses


Consider the problem of balancing parentheses on input. Add to the
beginning and to the end of the string any required parentheses to
make matched pairs with the minimum number of additions.
Figure 10 illustrates one solution, with the block diagram of Figure 11.
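
The problem can also be stated compactly in Python. This sketch uses
a different technique from the POINT and SEQ approach of Figure 10: a
single counting pass over the string. The function name
balance_parentheses is hypothetical.

```python
def balance_parentheses(st):
    """Add the fewest parentheses at the beginning and at the end of st
    so that every parenthesis has a partner."""
    missing_left = 0    # ")" symbols met with no "(" waiting for them
    open_count = 0      # "(" symbols still waiting for a ")"
    for symbol in st:
        if symbol == "(":
            open_count += 1
        elif symbol == ")":
            if open_count:
                open_count -= 1      # pairs with the latest waiting "("
            else:
                missing_left += 1    # needs a "(" at the beginning
    return "(" * missing_left + st + ")" * open_count


print(balance_parentheses("A+B)*(C"))
```

Each unmatched right parenthesis forces exactly one addition at the
beginning and each unmatched left parenthesis exactly one at the end,
so no smaller number of additions can balance the string.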


Figure 10


Insert Parenthesis


Operation    Operand

SET          ST1=ST
SET          ST2=ST
SET          STR=ST1
SET          STL=ST2
POINT        (ST1,FOS(STR,SYRP),PTR)
SEQ          (PTR,SYR,L16)
SET          STR=PTR
POINT        (ST2,FOS(STR,SYLP),PTL)
SEQ          (PTL,SYL,L11)
SET          STL=PTL
SET          L=PTL
SET          R=PTR
SET          I=R-L
POS          I,B
TRATRUE      B,L6
NEG          I,B
TRATRUE      B,L4
ADDSYL       (STA,SYLP)
POINT        (ST1,FOS(STR,SYRP),PTR)
SEQ          (PTR,SYR,L16)
SET          STR=PTR
TRA          L3

TRA          L1
ADDSYL       (STA,SYLP)
POINT        (ST1,FOS(STR,SYRP),PTR)
SEQ          (PTR,SYR,L17)
SET          STR=PTR
TRA          L11
POINT        (ST2,FOS(STL,SYLP),PTL)
SEQ          (PTL,SYL,L17)
SET          STL=PTL
ADDSYL       (STB,SYRP)
TRA          L16
SET          L=PTL
SET          R=PTR
SET          I=R-L
POS          I,B
TRATRUE      B,L18
NEG          I,B
TRATRUE      B,L19
ADDSYL       (STA,SYLP)
ADDSYL       (STB,SYRP)
CONCAT       (STA,ST)
CONCAT       (STA,STB)
SET          ST=STA
HALT

Label column: L1, L2, L3, L4, L6, L11, L16, L17, L19, L18


Figure 11


Insert Required Parenthesis


(Block diagram. Boxes: Initialize; Find right ), Last?; Compare FOS;
Add ( to STA; Find next ), Last?; Find next, Last?; Compare FOS;
Add ) to STB.)


E. LANGUAGE TWO 


While Language One has only procedures with the functional notation, 
Language Two has both functions and procedures with functional 
notation. Figure 12 is an example of a program written in Language 
Two. 


Figure 12 


Examples of Programs in Language 2 


FOS (ST, SY, I)

ASSIGN (ONE, I);
LC: CONDTRA (EQUAL(GETNS(ST, I), SY), LA);
CONDTRA (EQUAL(NORM(ST), I), LB);
ASSIGN (ADD(ONE, I), I); GO TO (LC);
LB: ASSIGN (ZERO, I); LA: RETURN (DUMMY);


SQUARE ROOT

ASSIGN (A, XOLD);
LB: ASSIGN (DIV(ADD(DIV(A, XOLD), XOLD), TWO), XNEW);
CONDTRA (LESS(ABS(SUB(XNEW, XOLD)), EPSILON), LA);
ASSIGN (XNEW, XOLD); GO TO (LB); LA: STOP (IDENT);


Polish Notation 


All examples in this section will be done in Polish (or parentheses-
free) notation which, by the way, is used for the B5000. Polish
notation is a method of expression which was developed by
Jan Lukasiewicz, a Polish mathematician. Since "Polish" is much
easier to say than the mathematician's name, his method of notation
has been dubbed "Polish".
Instead of writing A+B, the notation would be +AB. A+B+C would be
transformed to ++ABC, where A+B is executed first and the quantity 
A+B is added to C as a second operation. 


The hierarchy of operations is: exponentiation, denoted as an arrow
pointing upward; multiply and divide; and add and subtract. The
example A+(B*C) would be written as +A*BC.
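
The transformation from infix to Polish under this hierarchy can be
sketched in Python. This is an illustration under stated assumptions:
operands are single characters, ^ stands in for the upward arrow,
exponentiation is assumed to group from right to left while the other
operators group from left to right, the input is assumed to be well
formed, and the function name to_polish is hypothetical.

```python
PRECEDENCE = {"+": 1, "-": 1, "*": 2, "/": 2, "^": 3}


def to_polish(infix):
    """Translate a well-formed infix expression into Polish notation."""
    tokens = [c for c in infix if not c.isspace()]
    position = 0

    def parse(min_precedence):
        nonlocal position
        token = tokens[position]
        position += 1
        if token == "(":
            left = parse(1)          # a parenthesized part starts afresh
            position += 1            # skip the matching ")"
        else:
            left = token
        # Absorb operators as long as they bind at least as tightly as
        # the caller requires.
        while (position < len(tokens)
               and tokens[position] in PRECEDENCE
               and PRECEDENCE[tokens[position]] >= min_precedence):
            op = tokens[position]
            position += 1
            # "^" groups right to left; the others group left to right.
            next_min = PRECEDENCE[op] if op == "^" else PRECEDENCE[op] + 1
            right = parse(next_min)
            left = op + left + right     # operator first, then operands
        return left

    return parse(1)


print(to_polish("A+B+C"), to_polish("A+(B*C)"))
```

The operator is written in front of its two translated operands at
every level, so no parentheses survive in the result.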


Consider the formula: ((B-C)*A)*(X-(Y↑T)). This would be
expressed in Polish notation as **-BCA-X↑YT. Performing a
right to left scan to find the order of operations, the first operation
the machine would perform is to raise Y to the T power and place it
in TEMP 1.


The second is to subtract TEMP 1 from X and store it in TEMP 2.
The third operation subtracts C from B and places it in TEMP 3.
The fourth phase will multiply A and TEMP 3 and place the product
in TEMP 4. The fifth is to multiply TEMP 4 and TEMP 2 and place
the product in TEMP 5. The output as expressed in machine
language is shown in Figure 13.


Figure 13 


↑    Y,T,T1
-    X,T1,T2
-    B,C,T3
*    T3,A,T4
*    T4,T2,T5


Figure 14 illustrates the same solution, with the stack activity. 


Figure 14 


Stack (top first)      Output

Y  T                   ↑Y,T,T1
X  T1                  -X,T1,T2
B  C  A  T2            -B,C,T3
T3  A  T2              *T3,A,T4
T4  T2                 *T4,T2,T5
T5
In using a left to right scan, the pushdown storage is represented 
in Figure 15 with two pushdown stacks. 


Figure 15 


Step   Operator stack   Operand stack       Output 
       (top first)      (top first) 
1      - * *            C B ( ( (           -C,B,T1 
2      * *              A T1 ( (            *A,T1,T2 
3      ↑ - *            T Y ( X ( T2 (      ↑T,Y,T3 
4      - *              T3 X ( T2 (         -T3,X,T4 
5      *                T4 T2 (             *T4,T2,T5 


Using a more complex problem, a transformation into Polish will 
be accomplished (Figure 16) and the two types of stack analysis will 
be performed. They are: 1) the right to left analysis, as shown in 
Figure 17, and 2) the left to right analysis, with two pushdown 
stacks, in Figure 18. The problem is 
((((A*(B-D))*(R-S))-(X/(Y-Z)))*(L-M)). 


Figure 16 


*-**A-BD-RS/X-YZ-LM 


Figure 17 


Step   Stack before operation   Operation    Output 
       (top first) 
1      L M                      L-M=T1       (-L,M,T1) 
2      Y Z T1                   Y-Z=T2       (-Y,Z,T2) 
3      X T2 T1                  X/T2=T3      (/X,T2,T3) 
4      R S T3 T1                R-S=T4       (-R,S,T4) 
5      B D T4 T3 T1             B-D=T5       (-B,D,T5) 
6      A T5 T4 T3 T1            A*T5=T6      (*A,T5,T6) 
7      T6 T4 T3 T1              T6*T4=T7     (*T6,T4,T7) 
8      T7 T3 T1                 T7-T3=T8     (-T7,T3,T8) 
9      T8 T1                    T8*T1=T9     (*T8,T1,T9)   Answer 
10     T9 


Figure 18 


*-**A-BD-RS/X-YZ-LM 


Left to right (2 pushdowns). 

Step   Operation    Output 
1      B-D=T1       (-D,B,T1) 
2      T1*A=T2      (*T1,A,T2) 
3      R-S=T3       (-S,R,T3) 
4      T3*T2=T4     (*T3,T2,T4) 
5      Y-Z=T5       (-Z,Y,T5) 
6      X/T5=T6      (/T5,X,T6) 
7      T4-T6=T7     (-T6,T4,T7) 
8      L-M=T8       (-M,L,T8) 
9      T8*T7=T9     (*T8,T7,T9) 


From the preceding example it is seen that two types of algorithms 
are necessary: One is an algorithm for scanning algebraic 
expressions in Polish notation and the other an algorithm for 
scanning functional notation. 
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F. ALGORITHMS FOR SCANNING 


Right to Left Scan - Algebraic Expression 


A right to left scan must: 


1) Add operands to stack; 


2) When an operator is encountered, output the operator, the 
two operands on top of the stack and a generated temporary (GT), 
remove the two operands from the top of the stack, and put the 
GT on the stack. 
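
The two rules above can be sketched in Python. This is only an illustration, not the notes' own program: it assumes every operand and operator is a single character, and the function name and the T1, T2, ... naming of generated temporaries are choices made here.

```python
def scan_right_to_left(polish):
    """Scan a prefix (Polish) string right to left and return the outputs."""
    operators = set("+-*/↑")
    stack, output, count = [], [], 0
    for symbol in reversed(polish):
        if symbol in operators:
            # Rule 2: take the two operands on top of the stack.
            first, second = stack.pop(), stack.pop()
            count += 1
            temp = f"T{count}"                     # generated temporary
            output.append(f"{symbol}{first},{second},{temp}")
            stack.append(temp)
        else:
            stack.append(symbol)                   # Rule 1: stack the operand
    return output


for line in scan_right_to_left("**-BCA-X↑YT"):
    print(line)
```

Applied to **-BCA-X↑YT, the sketch lists each operator followed by the operand that was on top of the stack, the operand below it, and the temporary.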


Left to Right Scan - Two Pushdowns - Algebraic Expression 


When an operator is encountered, it is added to the operator stack 
and a separator placed on the operand stack. When an operand is 
encountered, it is entered in the operand stack. Whenever an 
operand is added to the operand stack (this would happen when a 
generated temporary is added to the stack, as well as when an 
operand is encountered in the source program), the stack should 
be checked for TWO adjacent operands on top. If there are none, 
the scan is continued. If there are, the operator on top of the 
operator stack is output along with the two operands on the top 
of the operand stack and a generated temporary. The operator 
and the operands that were output from the stacks, and the separator, 
are removed. The generated temporary is added to the operand 
stack. 
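
A minimal Python sketch of the two-pushdown rule follows. It assumes single-character symbols and uses "(" as the separator; the function name and temporary naming are illustrative assumptions, not part of the original notes.

```python
def scan_two_pushdowns(polish):
    """Left to right scan of a prefix string using two pushdown stacks."""
    operators = set("+-*/↑")
    separator = "("
    operator_stack, operand_stack, output = [], [], []
    count = 0

    def add_operand(item):
        nonlocal count
        operand_stack.append(item)
        # Reduce while TWO adjacent operands are on top of the operand stack.
        while len(operand_stack) >= 2 and separator not in operand_stack[-2:]:
            top = operand_stack.pop()
            below = operand_stack.pop()
            operand_stack.pop()                    # remove the separator
            count += 1
            temp = f"T{count}"
            output.append(f"{operator_stack.pop()}{top},{below},{temp}")
            operand_stack.append(temp)             # may trigger another reduction

    for symbol in polish:
        if symbol in operators:
            operator_stack.append(symbol)
            operand_stack.append(separator)
        else:
            add_operand(symbol)
    return output


for line in scan_two_pushdowns("**-BCA-X↑YT"):
    print(line)
```

Because a generated temporary is itself an operand, adding it back to the operand stack re-runs the adjacency check, which is what lets one incoming operand force several outputs in a row.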


Left to Right Scan - One Pushdown - Algebraic Expression 


Add the operators and operands to the stack. Whenever an operand is 
added, a check is made for TWO adjacent operands on top; if there 
are none, the scan is continued. If there are, the TWO operands, the 
operator just below them in the stack and a generated temporary are 
output. The two operands and operator are removed and the generated 
temporary added to the stack. 
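
The single-pushdown variant can be sketched the same way. The sketch assumes single-character symbols and a well-formed prefix string, so the item directly below two adjacent operands is always their operator; the names are illustrative only.

```python
def scan_one_pushdown(polish):
    """Left to right scan of a prefix string using a single pushdown stack."""
    operators = set("+-*/↑")
    stack, output, count = [], [], 0
    for symbol in polish:
        stack.append(symbol)
        # Two adjacent operands on top: the operator lies directly below them.
        while (len(stack) >= 3
               and stack[-1] not in operators
               and stack[-2] not in operators):
            top, below, operator = stack.pop(), stack.pop(), stack.pop()
            count += 1
            temp = f"T{count}"
            output.append(f"{operator}{top},{below},{temp}")
            stack.append(temp)
    return output


print(scan_one_pushdown("+A*BC"))
```

No separator is needed here, since the operators themselves mark where each group of operands begins.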


Right to Left Scan - Functional Notation 


Place )s and operands in the stack. When a ( is encountered, this means 
the name in front of it is an operator or a function. This operator 
should be output, along with all the operands down to the first ) in 
the stack. A generated temporary should also be output. The 
operands and the ) should be removed from the stack, and the generated 
temporary added to it. 
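
As an illustration, the right to left rule for functional notation might look like this in Python. The tokenizer (names made of letters and digits, commas ignored) and the output layout "operator operand,...,temporary" are assumptions made for this sketch.

```python
import re


def scan_functional_right_to_left(text):
    """Right to left scan of functional notation such as R(S,T(M,N),Q)."""
    tokens = re.findall(r"[A-Za-z0-9]+|[()]", text)    # commas are dropped
    stack, output, count = [], [], 0
    i = len(tokens) - 1
    while i >= 0:
        token = tokens[i]
        if token == "(":
            operator = tokens[i - 1]       # the name in front of the (
            i -= 1                         # the operator name is consumed too
            operands = []
            while stack[-1] != ")":
                operands.append(stack.pop())
            stack.pop()                    # remove the matching )
            count += 1
            temp = f"T{count}"
            output.append(f"{operator} {','.join(operands + [temp])}")
            stack.append(temp)
        else:
            stack.append(token)            # a ) or an operand
        i -= 1
    return output


print(scan_functional_right_to_left("R(S,T(M,N),Q)"))
```

Since operands are pushed from the right, they come off the stack in their written order.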
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Left to Right Scan - One Pushdown - Functional Notation 


Add all operators, (s, and operands to the stack as they are 
encountered. When a ) is encountered, output the operator which 
is below the highest ( in the stack and all the operands in the stack 
above it. Also output a generated temporary. Remove the operator 
and all the operands that were output from the stack, as well 
as the ( that separated them. Add the generated temporary to the 
stack. 


Left to Right Scan - Two Pushdowns - Functional Notation 


When an operator is encountered, recognized by the fact that it is 
followed by a left parenthesis, add it to the operator stack and place 
a ( separator in the operand stack. When an operand is encountered, 
add it to the operand stack. When a ) is encountered, output the 
operator that is on top of the operator stack, all the operands from 
the operand stack down to the first (, and a generated temporary. 
Remove all of these items output from the stacks, as well as the 
top-most ( from the operand stack. Add the generated temporary 
to the operand stack. 
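
A sketch of the two-pushdown rule for functional notation, under the same assumptions as before (hypothetical function name, names made of letters and digits, commas ignored):

```python
import re


def scan_functional_two_pushdowns(text):
    """Left to right scan of functional notation with two pushdown stacks."""
    tokens = re.findall(r"[A-Za-z0-9]+|[()]", text)    # commas are dropped
    operator_stack, operand_stack, output = [], [], []
    count = 0
    for i, token in enumerate(tokens):
        if token == "(":
            continue                       # handled together with the operator
        if token == ")":
            operands = []
            while operand_stack[-1] != "(":
                operands.append(operand_stack.pop())
            operand_stack.pop()            # remove the top-most ( separator
            count += 1
            temp = f"T{count}"
            operator = operator_stack.pop()
            output.append(f"{operator} {','.join(operands + [temp])}")
            operand_stack.append(temp)
        elif i + 1 < len(tokens) and tokens[i + 1] == "(":
            operator_stack.append(token)   # a name followed by ( is an operator
            operand_stack.append("(")
        else:
            operand_stack.append(token)
    return output


print(scan_functional_two_pushdowns("R(S,T(M,N),Q)"))
```

Here the operands come off the stack top first, so each output lists them in the reverse of their written order.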


The Burroughs B5000 uses Polish notation with a right operator 
instead of a left, i.e., suffix notation. For instance, externally 
a value might be expressed as ((b-c)*a)*(x-(y↑t)). Expressed in 
Polish notation with a suffix operator, this value would be: 

bc-a*xyt↑-* 
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G. LANGUAGE AMBIGUITY 


Introduction 


Consider each of the following three examples. Examine each for 
ambiguity. 


For a*(-b) there are two operators and two operands. The hierarchy 
is not clear as to which operation should take precedence. a*(-b+r) 
is a value which may also be expressed as *a-br. Here no ambiguity 
exists, for b can very easily be subtracted from r and "a" multiplied 
by the result. The last example is a*(-b-d). In Polish notation, 
this would be: *a--bd, and once again a hierarchical ambiguity exists. 


Subscripts 


One possibility for eliminating ambiguity would be to add subscripts 
to the operators. Our last example would then be expressed as: 
*₂ A -₂ -₁ B D 


To use a more complex example, consider the following: (A*(-B))*(-(X-Y)). 
In Polish notation the formula would be expressed with subscripts 
as **A-₁B-₁-XY. If then, by simple substitution, an F represented 
the *, a G the -, and an H the -₁, the following would result: 
F(F(A,H(B)),H(G(X,Y))). If the normal operators +, -, *, /, as 
well as subscripted operators, are eliminated, an input statement 
would be similar to R(S,T(M,N),Q). 


R(S,T(M,N),Q) as an input string could be processed in a manner 
similar to Figure 19. 


Figure 19 


Right To Left Scan 

Phase   Stack (bottom first)   Output 
1       ) Q ) N M              T M,N,T1 
2       ) Q T1 S               R S,T1,Q,T2 
3       T2 
With a right to left scan, ) Q ) N and M would appear in the stack. At 
this point, T M,N,T1 would be output. During the second phase 
the stack would contain ) Q T1 S, and would require outputting of 
R S,T1,Q,T2. In the third phase T2 would be alone in the stack. 
Figure 20 illustrates the same problem with a left to right scan 
with two pushdown stacks. 

Figure 20 


Left to Right Scan 

Phase   Operator stack   Operand stack    Output 
        (bottom first)   (bottom first) 
1       R T              ( S ( M N        T N,M,T1 
2       R                ( S T1 Q         R Q,T1,S,T2 
3       (empty)          T2 


In phase one the operator stack would contain R T, and the operand 
stack ( S ( M N, with an output of T N,M,T1. The second phase would 
have an R in the operator stack, an operand stack of ( S T1 and Q, 
and an output of R Q,T1,S,T2. In the last phase the operator stack is 
empty and the operand stack contains T2. 


A further example of elimination of all normal operators such as 
+, -, / and * would be: F(G(H,M),L,B(X),R(S,T,V,Q)). The two 
solutions to this problem include both types of scans: left to right 
with the two pushdown stacks, one for the operators and one for the 
operands (Figure 21), and the right to left scan (Figure 22). For both, 
the output is written in functional notation. 
Figure 21 


Left to Right Scan 


F(G(H,M), L, B(X), R(S,T,V,Q)) 

Step   Operator   Output 
1      G          M,H,T1 
2      B          X,T2 
3      R          S,T,V,Q,T3 
4      F          T1,L,T2,T3,T4 
Figure 22 


Right to Left Scan 


F(G(H,M), L, B(X), R(S,T,V,Q)) 

Step   Operator   Output 
1      R          S,T,V,Q,T1 
2      B          X,T2 
3      G          H,M,T3 
4      F          T3,L,T2,T1,T4 
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Review of Polish Notation 


Polish notation with left to right scan will be implemented in the 
following example: ((((A*(B-D))*(R-S))-(X/(Y-Z)))*(L-M)). 
The pushdown at the time of each output is shown in Figure 23, along 
with the functional notation output. 


Figure 23 


((((A*(B-D))*(R-S))-(X/(Y-Z)))*(L-M)) 


Right Polish, L→R scan. 


ABD-* RS-* XYZ-/ - LM-* 

Step   Stack (top first)   Output 
1      D B A               -D,B,T1 
2      T1 A                *T1,A,T2 
3      S R T2              -S,R,T3 
4      T3 T2               *T3,T2,T4 
5      Z Y X T4            -Z,Y,T5 
6      T5 X T4             /T5,X,T6 
7      T6 T4               -T6,T4,T7 
8      M L T7              -M,L,T8 
9      T8 T7               *T8,T7,T9 
10     T9 
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Scan for Language Two 


Since Language Two is a higher level language than Language One, 
a scan for the former can be readily implemented in the latter. 
Figure 24 is an example of such a scan. 


Figure 24 


Scan For Language Two 
(written in Language One) 


START:   FREE (NAMST); 
LA:      INPUT (SCRST, LEND); 
LB:      NORM (SCRST, TA); 
         TRAEQ (TA, ZERO, LA); 
         REMNS (SCRST, ONE, CURSY); 
         TRAEQ (CURSY, BLKSY, LB); 
         TRAEQ (CURSY, SCLSY, LB); 
         TRAEQ (CURSY, COLSY, COLSR); 
         TRAEQ (CURSY, LPRSY, LPARSR); 
         TRAEQ (CURSY, COMSY, COMSR); 
         TRAEQ (CURSY, RPRSY, RPARSR); 
         ADDSL (NAMST, CURSY); 
         GO TO (LB); 
COLSR:   CONCAT (COLST, NAMST, NAMST); 
         OUTPUT (NAMST); FREE (NAMST); 
         GO TO (LB); 
LPARSR:  PUSH (FUNCST, LPRSY); 
         PUSH (PARST, LPRSY); 
         CONCAT (NAMST, FUNCST, FUNCST); 
         FREE (NAMST); GO TO (LB); 
COMSR:   NORM (NAMST, TA); 
         TRAEQ (TA, ZERO, LB); 
         CONCAT (NAMST, PARST, PARST); 
         PUSH (PARST, COMSY); 
         FREE (NAMST); 
         GO TO (LB); 
RPARSR:  NORM (NAMST, TA); 
         TRAEQ (TA, ZERO, LD); 
         CONCAT (NAMST, PARST, PARST); 
         PUSH (PARST, COMSY); 
         FREE (NAMST); 
LD:      FREE (WSTA); 
LF:      REMNS (FUNCST, ONE, SYA); 
         TRAEQ (SYA, LPRSY, LE); 
         ADDSL (WSTA, SYA); 
         GO TO (LF); 
LE:      CONCAT (LINKST, WSTA, WSTA); 
         OUTPUT (WSTA); 
         FREE (WSTA); 
         REMNS (PARST, ONE, SYA); 
LI:      REMNS (PARST, ONE, SYA); 
         TRAEQ (SYA, LPRSY, LG); 
         TRAEQ (SYA, COMSY, LH); 
         ADDSL (WSTA, SYA); 
         GO TO (LI); 
LH:      CONCAT (COMST, WSTA, WSTA); 
         OUTPUT (WSTA); 
         FREE (WSTA); 
         GO TO (LI); 
LG:      CONCAT (COMST, WSTA, WSTA); 
         OUTPUT (WSTA); 
         FREE (WSTA); 
         NORM (FUNCST, TA); 
         TRAEQ (TA, ZERO, LJ); 
         GENT (SYA); 
         PUSH (PARST, SYA); 
         PUSH (PARST, COMSY); 
         ADDSL (WSTA, SYA); 
         CONCAT (RESST, WSTA, WSTA); 
         OUTPUT (WSTA); 
         FREE (WSTA); 
         GO TO (LB); 
LJ:      RESTEMP (DUMMY); 
         GO TO (LB); 
LEND:    STOP (IDENT); 


In order to further clarify Language Two, it will be used in the last 
example of this section showing the contents of the main string, the 
functional strings, the parameter strings, and the output at each of 
the five stages in Figure 25. The example is: 
L: AB(DE(M,NT),FG(RQ)); 


Figure 25 


L: AB(DE(M,NT),FG(RQ)); 

Columns: Name String, Func. String, Parameter Str., Output. 

Output: 

FUNC DE 
PAR NT 
PAR M 
RES T1 

FUNC FG 
PAR RQ 
RES T2 

FUNC AB 
PAR T2 
PAR T1 

FREE 
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Language Three 


Language Three uses an infix notation. Given the statement 
X = A*(B+R)/Q, in XTRAN the equal sign is replaced by a ←. 
This is the type of notation that is used for XTRAN on the 7090, 
1620, 1401 and 1410. In ALGOL, statements similar to the following 
three will exist: 


1. L: X←A*(X-Y); 
2. IF A>B THEN X←Y; S←T. 
3. IF B THEN E1 ELSE E2; E3. 


The last statement will execute either E1 and E3 or E2 and E3. 


At this point, a similarity between the statements used in XTRAN and 
those used in ALGOL can be recognized. 


XTRAN is a dialect of ALGOL. Almost all languages, FORTRAN, 
COBOL and ALGOL among them, use an infix notation so that each 
operator is unique, thus eliminating ambiguities. 
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H. COMPILER LOGIC 


Generally, there are five phases to compilers. They are: 
Phase 1 - HTR - Hardware To Reference 
Phase 2 - Pre-scan 
Phase 3 - Scan 
Phase 4 - Macro expander 
Phase 5 - Assembly program 


Phase 1 - Hardware To Reference, converts the external 
representation to the internal representation. With the example 
A*B, referring to the Dictionary of Internal Identifiers for Symbols 
(Figure 26), a transformation is made into 178 52 180. The internal 
representation has 0 through 177 as operators, 178 through 239 as 
alphabetic characters, 240 through 254 as integers, and 1,000 
through 4,999 as names. Before leaving the HTR phase, consider 
a more complex example that can be followed through the subsequent 
phases. In the example RSA .LA. RB .UP., RSA would be translated 
as 214 218 178; .LA. would be translated as 80; RB as 214 180; 
and .UP. as 35. Therefore, leaving the hardware to reference phase 
would be the string of numbers: 214 218 178 80 214 180 35. 


Phase 2 - The pre-scan converts the names to single integers and 
also eliminates ambiguity within parentheses. 214 218 178 would, 
in this phase, be grouped together and given an arbitrary number, 
such as 1000. 214 180 would also be grouped together and given 
another arbitrary number, such as 1001. The string 1000 80 1001 35 
would have been developed. 
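
The first two phases can be sketched in Python as follows. The sketch uses only the internal codes quoted in the text (A = 178, B = 180, R = 214, S = 218, * = 52, .LA. = 80, .UP. = 35); the function names, the pre-tokenized input and the policy of numbering names from 1000 in order of first appearance are assumptions made here for illustration.

```python
# Internal codes quoted in the text; a full table would hold every symbol.
SYMBOL_CODES = {"*": 52, ".LA.": 80, ".UP.": 35}
LETTER_CODES = {"A": 178, "B": 180, "R": 214, "S": 218}


def hardware_to_reference(tokens):
    """Phase 1: replace each external symbol or letter by its internal code."""
    codes = []
    for token in tokens:
        if token in SYMBOL_CODES:
            codes.append(SYMBOL_CODES[token])
        else:
            codes.extend(LETTER_CODES[letter] for letter in token)
    return codes


def pre_scan(codes, first_name=1000):
    """Phase 2: group each run of letter codes into a single name number."""
    names, result, current = {}, [], []

    def flush():
        if current:
            key = tuple(current)
            if key not in names:
                names[key] = first_name + len(names)
            result.append(names[key])
            current.clear()

    for code in codes:
        if 178 <= code <= 239:          # alphabetic character
            current.append(code)
        else:
            flush()
            result.append(code)
    flush()
    return result


internal = hardware_to_reference(["RSA", ".LA.", "RB", ".UP."])
print(internal)
print(pre_scan(internal))
```

The dictionary of names kept by the pre-scan is what lets a repeated name receive the same number each time it appears.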


Phase 3 initiates another scan; Phase 4 is the macro expander which 
includes such things as multiply; and Phase 5, the final assembly. 
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Figure 26 


Dictionary of Internal Identifiers for Symbols - 7090 XTRAN 


(I marks an internal symbol.) 

     Internal         External 
I    Representation   Representation   Comment 

     000              .NUL.            Null symbol 
     001              .COMMENT.        Comment symbol 
     001              '                Comment symbol 
     002              .ARRAY.          Declaration 
     003              .SWITCH.         Declaration 
     004              .INIT.           Declaration 
     004              .INITIAL.        Declaration 
     004              .INITIALIZE.     Declaration 
     005              .SWITCHVAR. 
     008              .REAL.           Type declaration 
     009              .INTEGER.        Type declaration 
     009              .SYMBOL.         Type declaration 
     010              .LOGICAL.        Type declaration 
     010              .BOOL.           Type declaration 
     011              .STRING.         Type declaration 
     016              .SC.             Semicolon 
     019              .COL.            Colon 
     019              .LAB.            Colon 
     022              .TLAB. 
     024              .THEN. 
     027              .STEP. 
     028              .WHILE. 
     029              .TILL.           Until 
     030              .REPEAT. 
     032              (                Left parenthesis 
     034              .LB.             Left bracket 
     035              .UP.             Exponentiation 
     036              .BEGIN.          Left brace 
     045              .UM.             Unary minus 
     045              .UMIN.           Unary minus 
     046              .AB.             Absolute value 
     048              + 
     050              .DEFINE. 
     051              .MD.             Modulo 
     052              * 
     054              - 
     055              / 
     064 
     065              .RB.             Right bracket 
     066              )                Right parenthesis 
     069              .END.            Right brace 
     070              .XB. 
     073              .BLANK.          Blank 
     073                               Blank 
     073                               Blank 
     074              .PROCEDURE. 
     075              .OP.             Operation 
     075              .OPOP.           Operation 
     076              .LC.             Location 
     076              .LCOP.           Location 
     077              .RETURN.         Return 
     077              .RTN.            Return 
     080              .LA.             Left assign 
     080              .AS.             Left assign 
     082              .UE.             Relational 
     083              .GT.             Relational 
     084              .GE.             Relational 
     085              .LT.             Relational 
     086              .LE.             Relational 
     087                               Relational 
     096              .NOT.            Logical 
     097              .OR.             Logical 
     098              .LOG*.           Logical 
     098              .AND.            Logical 
     099              .IDENTICAL.      Logical 
     099              .IDENT.          Logical 
     099              .IDT.            Logical 
     102              .IMPLIES.        Logical 
     102              .IMP.            Logical 
I    104              .COMPA. 
     105              .COMPUTE. 
     112              .GOTO. 
     112              .GO TO. 
     112              .GO. 
     114              .FOR. 
     115              .IF. 
     116              .PARSEP.         Parameter separator 
     116              .PSEP.           Parameter separator 
     117              .ELSE. 
     119              .LOOP. 
     119              .DO. 
     121              .RA.             Right assign 
     121              $                Right assign 
     121              .EQCO.           Right assign 
     125              .BSOS.           Begin SOS 
     126              .ESOS.           End SOS 
     127              .MACSEP.         Macro separator 
     127              .MSEP.           Macro separator 
     128              .THENA. 
     129              .CM. 
     130              .F. 
     131              .P. 
     132              .N. 
     133              .X. 
     134              .XB. 
     135              .A. 
     136              .C. 
     137              .YIELDS. 
     138              .YIELDSA. 
     139              .OTHERW. 
     139              .OTHERWISE. 
     140              .FORA. 
     141              .FORB. 
     142              .FORC. 
     143              .FWHILE. 
     144              .INFOR. 
     145              .LOOPE. 
     146              .PAR. 
     147              .RES. 
     148              .FL. 
     149              .SWIL. 
     150 through 164 
     178 through 232                   Letters (A = 178, B = 180, 
                                       R = 214, S = 218) 
     240 through 249                   Digits 
     255              .EOP.            (End of Program) 


1620 Scan: 


The 1620 scan is from left to right. The formula A+B*X+Y appears 
in the first phase with the operand stack containing A B and the 
operator stack + *. This will output the statement: *X,B,T1. The 
termination of the second phase finds the operator stack with +, and 
the operand stack with A T1, with an output of +T1,A,T2. The third 
phase would have + in the operator stack and T2 Y in the operand 
stack, and would have output +Y,T2,T3. 


Figure 27 

Phase   Operand stack    Operator stack   Output 
        (bottom first)   (bottom first) 
1       A B              + *              *X,B,T1 
2       A T1             +                +T1,A,T2 
3       T2 Y             +                +Y,T2,T3 
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I. SWITCH METHOD/SINGLE ADDRESS OUTPUT 


Taking the formula X+Y+Z+R*S+B, the following output would be 
derived, turning a switch ON if there is something in the accumulator, 
and OFF if there is nothing in the accumulator. Initially, 
nothing is in the accumulator (see Figure 28). The operand stack 
would contain X Y with a + in the operator stack. Output from this 
phase would be CLA X, ADD Y. 


The second phase would hold Z in the operand stack and + in the 
operator stack, with the switch ON. The output is ADD Z. 


The third phase contains R in the operand stack with + and * in the 
operator stack. Since the switch is ON, the output would be STO T1, 
and T1 is placed at the bottom of the operand stack. 


During phase 4 the operand stack holds T1 R S and the operator 
stack + *, with the accumulator switch OFF. The output is 
CLA R, MULT S and the switch is turned ON. 


For phase 5 only T1 will appear in the operand stack and + in the 
operator stack. Output would be ADD T1 with the switch ON. 


The last phase would have B in the operand stack and + in the operator 
stack. Output is ADD B with the switch ON, followed by STO. 
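
The effect of the accumulator switch can be illustrated with a small Python sketch. It is a simplification made for this note rather than the stack-driven 1620 routine: it handles only sums of products written with + and *, and the function name, the temporary names and the final storage location RESULT are hypothetical.

```python
def single_address(expression, result="RESULT"):
    """Generate single-address code for a sum of products such as X+Y+R*S."""
    code, temp_count = [], 0
    switch_on = False                      # ON means the accumulator is in use
    for term in expression.replace(" ", "").split("+"):
        factors = term.split("*")
        saved = None
        if len(factors) > 1 and switch_on:
            # A product is needed while the accumulator is busy: save it first.
            temp_count += 1
            saved = f"T{temp_count}"
            code.append(f"STO {saved}")
            switch_on = False
        if switch_on:
            code.append(f"ADD {factors[0]}")
        else:
            code.append(f"CLA {factors[0]}")
            switch_on = True
        code.extend(f"MULT {factor}" for factor in factors[1:])
        if saved is not None:
            code.append(f"ADD {saved}")    # bring the saved partial sum back
    code.append(f"STO {result}")
    return code


for instruction in single_address("X+Y+Z+R*S+B"):
    print(instruction)
```

The switch is what decides between CLA and ADD for an incoming operand, and it is also what signals that a partial result must be stored before a multiplication can begin.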
Figure 28 

Phase   Operand stack   Operator stack   Switch   Output 
1       X Y             +                OFF      CLA X, ADD Y 
2       Z               +                ON       ADD Z 
3       R               + *              ON       STO T1 
4       T1 R S          + *              OFF      CLA R, MULT S 
5       T1              +                ON       ADD T1 
6       B               +                ON       ADD B, STO ... 
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J. FORCING CODES AND ROUTINES 


Introduction 


The 1620 XTRAN Compiler uses the technique of forcing tables for 
compilation purposes. Since the hierarchy of operators must be 
free of all ambiguities, one method of elimination would be placing 
a numeric value on each operator and devising a rule for forcing 
the operator. For instance, assign the value of 10 to + and 5 to *. 
For the example X+Y*Z with the pointer at *, the addition would 
not be forced. 


In the example X*Y+Z the multiplication should be forced, and in 
X+Y+Z all should be forced. 


These rules could be condensed into the single statement: force 
the operator ρL if the right value of ρR is greater than or equal to 
the left value of ρL, where ρL is the operator to the left of the 
operand under examination and ρR is the operator to its right. By 
consulting a predetermined forcing table similar to Figure 29, 
ambiguities would be eliminated from the language. Forcing routines 
are included in Figure 30. 


Figure 29 


Forcing Table 

Operator   Left Value   Right Value   Force Code 
}          0            65            0 
;          60           60            7 
name       0 

(The remaining rows, covering the arithmetic, relational and logical 
operators and the symbols .COMPUTE., .COMPA., .WHILE., .IF., .THEN., 
.THENA., .ELSE., .FOR., .FORA., .FORB., .FORC. and .LOOP., are not 
legible in this copy.) 
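
The forcing rule can be tried out with a short Python sketch. It is an illustration under stated assumptions, not the XTRAN routine: each operator is given one value serving as both left and right value (10 for + and -, 5 for * and /, as in the text's example), the terminating ; is given 60 so that it forces everything to its left, and operands are listed in their written order.

```python
# One value per operator, used as both its left value and its right value.
VALUE = {"+": 10, "-": 10, "*": 5, "/": 5, ";": 60}


def force_compile(tokens):
    """Apply the forcing rule to a token list such as X + Y * Z ;"""
    string = list(tokens)                  # working string
    output, count, pointer = [], 0, 3      # pointer rests on the right operator
    while pointer < len(string):
        left_op, right_op = string[pointer - 2], string[pointer]
        if VALUE[right_op] >= VALUE[left_op]:
            # Force the left operator: replace N op N by a generated temporary.
            count += 1
            temp = f"T{count}"
            output.append(f"{left_op}{string[pointer - 3]},{string[pointer - 1]},{temp}")
            string[pointer - 3:pointer] = [temp]
            pointer = max(3, pointer - 2)  # back up over the shortened string
        else:
            pointer += 2                   # leave it for later; move right
    return output


print(force_compile(["X", "+", "Y", "*", "Z", ";"]))
print(force_compile(["X", "*", "Y", "+", "Z", ";"]))
```

In X+Y*Z the addition is passed over when the pointer reaches *, and is forced only after the multiplication has been replaced by its temporary.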
Figure 30 


Forcing Routines 


Force Code 1 (for binary operators) 
STO: α, Ni, ρL, Nj, ρR, β    (Original string) 
STN: α, Ti, ρR, β            (New string) 
STP: ρL, Ni, Nj, Ti          (Output string) 
Replace Ni ρL Nj by Ti in the string. 

Force Code 2 (for unary operators) 
STO: α, ρL, Nj, ρR, β 
STN: α, Ti, ρR, β 
STP: ρL, Nj, Ti 
Replace ρL Nj by Ti in the string. 

Force Code 3 (for ←) 
STO: α, Ni, ρL, Nj, ρR, β 
STN: α, Nj, ρR, β 
STP: ρL, Ni, Nj 
Remove Ni ρL from the string. 

Force Code 4 (for :) 
STO: 
STN: 
STP: 

Force Code 5 (for GOTO) 
STO: 
STN: 
STP: 

Force Code 6 (for ( and [ ) 
STO: α, ρL, Ni, ρR, β 
STN: α, Ni, β 
STP: nothing 
Remove ρL and ρR from the string; output nothing; move pointer 
back one. 

Force Code 7 (for ;) 
STO: α, Ni, ρL, ρR, β 
STN: α, ρR, β 
STP: nothing 
Remove Ni ρL from the string. 

Force Code 8 (for COMPUTE) 
STO: 
STN: 
STP: 

Force Code 9 (for COMPA) 
STO: 
STN: 
STP: 

Force Code 10 (for WHILE) 
STO: 
STN: 
STP: 

Force Code 11 (for IF) 
STO: α, ρL, Ni, ρR, β 
STN: α, Ni, ρR, β 
STP: nothing 
Remove ρL from the string. 

Force Code 12 (for THEN) 
STO: α, Ni, ρL, Nj, ρR, β 
STN: α, GLi, THENA, Nj, ρR, β 
STP: THEN Ni, GLi 
THEN is TRA to GLi if Ni is false. 
Replace Ni ρL by GLi THENA in the string. 

Force Code 13a (for THENA when ρR is ELSE) 
STO: α, GLi, ρL, Nj, ρR, β 
STN: α, GLj, ρR, β 
STP: TLAB GLj, GLi 
TLAB means TRA to GLj with the label GLi attached to the 
location following the TRA instruction. 
Replace GLi ρL Nj by GLj in the string. 

Force Code 13b (for THENA when ρR is not ELSE but ;, etc.) 
STO: α, GLi, ρL, Nj, ρR, β 
STN: α, Nj, ρR, β 
STP: LABEL GLi 
Remove GLi ρL from the string. 

Force Code 14b (for ELSE when ρR is not THEN) 
STO: α, GLi, ρL, Nj, ρR, β 
STN: α, Nj, ρR, β 
STP: LABEL GLi 
Remove GLi ρL from the string. 
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Illustrative Solution 
Consider the problem of: 
COMPUTE A←A+B↑2 WHILE A<100 
with a general form of: 
COMPUTE S WHILE C 
A solution for the problem, developing a forcing table (Figure 31) 
and the applicable codes (Figure 32), follows. An example for 
purposes of proof is also included as Figure 33. 


Figure 31 


Force Table 


            LV    RV    FC 
COMPUTE     0     60    8 
WHILE       60    59    10 
COMPA       60    60    9 
Figure 32 


Force Codes 


Force Code 8 - COMPUTE 
STO: α, ρL, Ni, ρR, β 
STN: α, GLi, COMPA, Ni, ρR, β 
STP: LABEL GLi 
Replace ρL with GLi COMPA. 


Force Code 9 - COMPA 
STO: α, Ni, ρL, Nj, ρR, β 
STN: α, ρR, β 
STP: COMPA Nj, Ni 
Remove Ni ρL Nj from the string. 


Force Code 10 - WHILE 
STO: α, Ni, ρL, Nj, ρR, β 
STN: α, Nj, ρR, β 
STP: none 
Remove Ni ρL from the string. 
Figure 33 


Proof 


COMPUTE A←A+B↑2 WHILE A<100; 


General form: COMPUTE S WHILE C, where S is A←A+B↑2 and C is A<100. 

Working string                         Output 
COMPUTE A←A+B↑2 WHILE A<100;           LABEL GL1 
GL1 COMPA A←A+B↑2 WHILE A<100;         ↑ B,2,T1 
GL1 COMPA A←A+T1 WHILE A<100;          + A,T1,T2 
GL1 COMPA A←T2 WHILE A<100;            ← A,T2 
GL1 COMPA T2 WHILE A<100;              < A,100,T3 
GL1 COMPA T2 WHILE T3;                 COMPA T3,GL1 

COMPA will generate coding to test T3 and branch to GL1 if true, 
i.e. TRATRUE T3,GL1. 


XTRAN Compiler 


With this background, an excerpt from an XTRAN compiler may 
be examined in Figure 34. 


Figure 34 


Excerpt From XTRAN Compiler 


     SWITCH BRANCH ← (LCV01, LCV02, LCV03, LCV04, LCV05, ..., 
         LCV50); 
L1:  GETXT (SR, LA); 
     ADDSL (WST, SR); 
L5:  NWST ← NORM (WST); 
     IF NWST < 3 THEN GOTO L1; 
     IF PTR < 3 THEN PTR ← 3; 
     IF PTR > NWST THEN GOTO L1; 
L6:  SL ← GETNS (WST, PTR-2); 
     SR ← GETNS (WST, PTR); 
     IF IMIN < SL 
         THEN GOTO LCV00 ELSE LVSL ← LV [SL]; 
     IF IMIN < SR 
         THEN RVSR ← 0 ELSE RVSR ← RV [SR]; 
     IF LVSL ≤ RVSR 
         THEN CVSL ← CV [SL] 
         ELSE CVSL ← 0; 
     IF CVSL = 0 THEN GOTO LCV00; 
     GOTO BRANCH [CVSL]; 
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_K. BACKUS NORMAL FORM 


John Backus, who wrote the first FORTRAN, developed a method 
of expression called Backus Normal Form (BNF). This mode of 
expression, or alphabet, is used in ALGOL. BNF is probably the 
most exciting development in metalanguages today (a metalanguage 
is a language to describe another language). 


Since the syntax of ALGOL is defined in BNF, the symbol definitions 
should be examined. The alphabet of BNF consists of: 


< >    class of objects 
::=    is defined as 
|      or 


So that a statement would take the following formats: 


<letter> ::= A|B|C|D|...|Z 
<operator> ::= +|-|↑|/|* 
<operand> ::= <letter> | <operator> <operand> <operand> 
<expression> ::= <operand> 


The syntax of ALGOL, expressed in BNF, would be of the format: 


<name> ::= <letter> | <name> <letter> | <name> <digit> 
<operand> ::= <letter> | <function> 
<function> ::= <letter> ( <list> ) 
<list> ::= <operand> | <list> , <operand> 
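
To show how such definitions can drive a scan, here is a small Python recognizer for the operand, function and list rules of functional notation (an operand is a letter or a function; a function is a letter followed by a parenthesized list; a list is one or more operands separated by commas). It is a sketch only: the function names are invented here, letters are taken to be A through Z, and the left-recursive list rule is handled by a loop rather than by recursion.

```python
def is_operand(text):
    """Return True if text is an <operand> made of letters, ( ) and commas."""
    position = 0

    def at(symbol):
        return position < len(text) and text[position] == symbol

    def operand():
        nonlocal position
        if not (position < len(text) and "A" <= text[position] <= "Z"):
            return False
        position += 1                      # <letter>
        if at("("):                        # <function> ::= <letter> ( <list> )
            position += 1
            if not operand_list():
                return False
            if not at(")"):
                return False
            position += 1
        return True

    def operand_list():
        nonlocal position
        if not operand():
            return False
        while at(","):                     # further operands after commas
            position += 1
            if not operand():
                return False
        return True

    return operand() and position == len(text)


print(is_operand("R(S,T(M,N),Q)"))
print(is_operand("R(S,,Q)"))
```

Each BNF class becomes one small routine, which is why a definition written this way translates so directly into a program.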
III. HEURISTIC COMPILERS 
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A. INTRODUCTION 


Need 





During the past decade there has been enormous progress in the 
development of higher level programming languages for instructing 
computers. Through the development of FORTRAN and ALGOL, 
COBOL, and the list processing languages IPL and COMIT, the 
labor of programming has been reduced several orders of magnitude. 


Yet programming a computer to perform a complex task is still 
very much more intricate and tedious than instructing an intelligent 
and trained human being for the same task. The intelligent and 
knowledgeable human does not have the literalness of mind that is so 
characteristic of the computer, and so exasperating in a programmer's 
interaction with it. He supplies facts from his own store of knowledge 
that his instructor neglected to give him. Given statements of 
objectives in broad functional terms, he can apply his problem 
solving powers to filling in the details of method. Meaning and intent 
are interpreted when he is confronted with the vagueness and 
informality of natural language. 


Experiments in heuristic compilers are aimed at further bridging the 
gap between the explicitness of existing computer programming 
languages and the freedom and flexibility of human communication. 


Definition 


Intelligent problem solving, whether by man or by machine, implies 
selective rather than just rapid behavior. Humans achieve this 
selectivity through heuristics: principles that, on the average, contribute 
to a reduction of search in problem solving. Heuristic programming, 
then, is the construction of computer problem solving programs 
whose behavior is similarly organized. 


Implications 


There are several implications of the definition of heuristic 
programming that should be noted: it is concerned with exploiting 
partial information in a problem situation where there is no 
guaranteed way of using that information to find a best solution; it 
prepares a procedure (expressed as a computer program) to make 
effective use of this partial information; and it is a body of 
knowledge built up through experience with specific examples, 
lacking as yet an underlying analytical framework. 


Using these implications as a guide, heuristic programming could be 
defined in terms of decisions which are made in an environment 
which may be characterized by: 1) lack of a feasible 
guaranteed method of reaching the best solution; 2) a description of 
the solution in terms of acceptability of characteristics; and 3) 
sufficiently many possibilities to make complete trial and error 
infeasible. Heuristic programming is concerned with constructing 
decision paradigms (computer programs) emphasizing the use of 
selectivity rather than computational speed. 


Applications 


A heuristic program has been prepared to balance production of 
assembly lines in a factory. Balancing the line is the act of assigning 
jobs to workers, so that a given production rate can be met with 
a minimum number of men. The Behavioral Theory of the Firm Project at 
Carnegie Institute of Technology is developing a simulation of the 
price and quantity decisions of a department store buyer; a heuristic 
program is now being prepared to simulate investment activities 
under a trust fund; and while no formal result has yet been published, 
two separate attempts to construct heuristic programs for the 
job-shop scheduling problem are under way. 


Design Representation 


One distinction between the restricted, relatively simple task of 
"coding" and the more difficult task of "programming" is that the 
latter may encompass the selection or design of an appropriate 
problem representation, and the former does not. Consideration 
must be given to the requirements in the design or selection of a 
representation, and to what is needed to give the heuristic compiler 
the capability and capacity to grapple with such design and selection 
tasks. 


Designing a suitable representation is, in effect, finding an 
isomorphism. A description is defined in terms of certain elements, 
relations between elements, and processes. The programmer has to 
find a set of elements, relations and processes defined in a heuristic 
compiler that are isomorphic with the required elements, relations 
and processes (i.e., the proper sub-set). 


Ill-structured Problems 


An ill-structured problem generally has three characteristics: 
(1) many of the essential variables are not numerical at all, but 
symbolic; (2) the goal is vague and non-quantitative; and (3) computational 
algorithms are not available. 


Allocating marketing expenditures among sales and promotional 
efforts, preparing a compiler for business applications, and playing chess 
are ill-structured, if not perverse, problems. 
Currently researchers are studying human methods of dealing with 
ill-structured problems. Theorem proving in geometry, chess, 
human discrimination, learning, human concept formation and 
language translation are all being observed. These problem solving 
activities have been called "heuristic problem solving", to emphasize 
the importance of principles or rules of thumb that tend to discover 
acceptable solutions more efficiently in most cases than do exhaustive 
methods. 


Pre-structuring 


Present digital computers require an explicit statement of the 
procedure to be carried out and of the data to be processed. But 
at any stage in developing a heuristic procedure, knowledge of the 
problem itself is incomplete. Methods of describing the problem 
and procedures for solving it demand easy revision of both procedures 
and data. 


The common answer to this dilemma is to pre-structure the problem 
environment, and then within the resulting and often stringent 
requirements, produce a solution procedure. The danger of this, 
however, lies in the loss of flexibility thus introduced. 


Language Characteristics 


Certain requirements for a language to express ill-structured
problems are: 1) freedom from memory allocation problems;
2) the ability to define and redefine complex concepts; 3) the
ability to make use of these concepts by name in defining further
concepts; 4) the ability to express concepts which are not meaningful
until some problem solving has occurred; 5) the ability to introduce
useful notations; 6) the ability to associate information in an easily
recoverable manner; and 7) the ability to change sections of the
decision process independently without problems of interrelations
with other sections.


A number of computer languages have been developed which begin to 
provide some of this desired freedom. The IPL series, LISP, 
FORTRAN, and COMIT, exemplify this approach to computer 
utilization. 


The specific features of the IPL series include: (1) organization
of storage into list structures; (2) use of the description processes
to associate new information with structures or to delete previously
associated information; and (3) the hierarchical nature of control, which
allows both for a natural hierarchical organization and specification
of the processes, and for recursive definition.


B. INFORMATION PROCESSING LANGUAGE V


Introduction 


IPL-V, the fifth of a series of languages, is designed for list
processing and symbol manipulation. It is used most heavily by
scientists in the fields of artificial intelligence and simulation of
cognitive processes.


Compiler Capabilities 


The compiling routines of IPL-V accept the task of writing heuristic
programs on the basis of certain information provided to them. The
routines differ with respect to their methods of formulating or
representing the problem, for each uses a different state language.
At present, three compiling routines exist: the SDSC (U140), the
DSCN (U134), and the General Compiler (U135).


The SDSC Compiler (State DeSCription) accepts as an input a
description of the contents of the relevant computer cells before
and after the routine to be compiled has been executed. It produces
an IPL-V routine that will transform the input state description
into the output state description.


The DSCN Compiler (DeSCriptive Name) employs as its input an
imperative sentence definition of the routine to be compiled. DSCN 
produces an IPL-V routine that is a translation, in the interpretive 
language, of that definition. 


The third compiler, the General Compiler, is an executive routine
that can use the SDSC, the DSCN and other sub-routines. The input
information can be stated in any of several representations,
i.e., those appropriate to SDSC or DSCN, and the General Compiler
selects sub-routines to produce the desired IPL-V code.


From a logical standpoint the Heuristic Coder could be described as 
a single program whose executive routine is the General Compiler 
and which contains the SDSC compiler and the DSCN compiler as
sub-routines. 


In the IPL-V language cells may have lists associated with them, and 
the primitive processes find their operands in a communication cell 
and its related list. The communication cell is also called an 
accumulator because it has many of the functions of the accumulator 
in a standard computer. Processes, with the exception of tests,
place outputs in the accumulator and in its pushdown lists. Tests in
IPL-V record their result by placing a plus or minus in a special cell
called the Signal Cell.


Lists in the IPL-V memory may have description lists associated with 
them. A description list consists of pairs of symbols in which the 
first symbol of each pair designates an attribute and the second symbol
designates the value of that same attribute. The value may be a
simple symbol, or it may itself be a list.


Values of attributes of objects may themselves have descriptions. 
Representations of routines or programs in memory will take the 
form of description lists, each routine having one or more of the 
attributes of IPL name, IPL definition, functional description, state 
description and flow diagram. The values of these attributes will 
themselves be described, or have description lists associated with 
them. 
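
The description list idea can be sketched in a few lines of Python. This is
only an illustration under stated assumptions, not IPL-V itself: a Python
list stands in for the IPL-V list, the function names find_value,
assign_value and erase_attribute are hypothetical, and the sample values
("R1", "COPY", "LIST", "BEFORE", "AFTER") are invented for the example.

```python
def find_value(description_list, attribute):
    """Return the value paired with `attribute`, or None if it is absent."""
    # Attributes sit at even positions, their values at the following odd ones.
    for i in range(0, len(description_list), 2):
        if description_list[i] == attribute:
            return description_list[i + 1]
    return None


def assign_value(description_list, attribute, value):
    """Replace the value of `attribute`, or append a new pair if it is new."""
    for i in range(0, len(description_list), 2):
        if description_list[i] == attribute:
            description_list[i + 1] = value
            return
    description_list.extend([attribute, value])


def erase_attribute(description_list, attribute):
    """Delete the pair for `attribute`; report whether anything was removed."""
    for i in range(0, len(description_list), 2):
        if description_list[i] == attribute:
            del description_list[i:i + 2]
            return True
    return False


# A routine described by two attributes; the DSCN value is itself a list.
routine = ["IPLN", "R1", "DSCN", ["COPY", "LIST"]]
assign_value(routine, "SDSC", ["BEFORE", "AFTER"])
```

Because a value may be any object, including another list, a value can
carry its own description in the same way.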


The SDSC Compiler 


A computer routine can be defined by specifying the changes it produces
in the storage locations it affects, or by specifying the before and after
conditions of these storage registers. A definition of this kind is not
univocal. There will generally be many programs, not all equally
efficient or elegant, that will do the same work, and the SDSC Compiler
attempts to find one to accomplish a given task.


The DSCN Compiler 


Instead of specifying the before and after condition of the computer
cells, a routine is defined in terms of the function it performs. The
DSCN Compiler first searches a list of available (compiled) routines to
find one whose DSCN is as similar as possible to the DSCN of the
routine to be compiled. Secondly, an analysis is performed to transform
the compiled routine that has been found into the new routine. When
the compiler finds differences, it searches for an operator relevant
to removing the differences, and the resulting program will be identical
with that obtained by the SDSC Compiler.
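
The two steps just described (find the most similar compiled routine, then
remove the differences one at a time) can be sketched as follows. This is a
toy model, not the DSCN Compiler's actual method: a descriptive name is
assumed to be a set of symbols, similarity is assumed to be the number of
symbols not shared, and the library, the operator table and all names in
them are hypothetical.

```python
def difference(a, b):
    """Symbols that appear in exactly one of two descriptive names."""
    return set(a) ^ set(b)


def most_similar(target, library):
    """Name of the compiled routine whose DSCN differs least from target."""
    return min(library,
               key=lambda name: len(difference(library[name]["dscn"], target)))


def compile_by_analogy(target, library, operators):
    """Transform the closest compiled routine into one matching target.

    Returns the new code, or None when some difference has no operator
    relevant to removing it.
    """
    name = most_similar(target, library)
    code = list(library[name]["code"])
    for symbol in sorted(difference(library[name]["dscn"], target)):
        operator = operators.get(symbol)
        if operator is None:
            return None
        # The flag tells the operator whether the symbol is wanted or unwanted.
        code = operator(code, symbol in target)
    return code


def reverse_operator(code, wanted):
    """Hypothetical operator that adds or removes a 'reverse' step."""
    if wanted:
        return code + ["reverse"]
    return [step for step in code if step != "reverse"]


library = {
    "R1": {"dscn": {"COPY", "LIST"}, "code": ["copy"]},
    "R2": {"dscn": {"ERASE", "LIST"}, "code": ["erase"]},
}
operators = {"REVERSE": reverse_operator}
```

Asking for {"COPY", "LIST", "REVERSE"} selects R1 and applies the one
relevant operator; asking for a symbol with no operator yields no routine.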


The General Compiler 


The General Compiler is an executive routine whose task is to compile
a routine from information in any of the forms already described
(SDSC and DSCN). It takes as its input the name of the routine to
be compiled. Associated with this name (on its description list) is
the information to be used in the compilation.


A ROUTINE is a description list containing values of some subset of
the following attributes:


1. IPLN, IPL Name (X25). The value of this attribute is a
description list that names a region and a location in the
region.


2. JDEF, IPL-V definition (X22). The value of this attribute
is a list of IPL-V instructions. Each is in the form of a
description list which describes the corresponding IPL-V
word that defines the IPL-V routine with specified names.


3. DSCN, DeSCriptive Name (X20). This is an imperative
sentence encoded as a list structure that describes the
process assigned by JDEF.


4. SDSC, State DeSCription (X240). The value of this attribute
is a list structure that describes the state of the IPL
computer before and after the routine in question has been
executed. Only changes are mentioned explicitly.


5. FLWD, FLoW Diagram (X267). This is a list structure
that gives a flow diagram corresponding to the JDEF, or
job definition.


6. ASOJ, ASsOciated J Definition (23). The value of this
attribute is the IPL name of a routine associated, in a
manner to be described later, with a given routine.


A compiled routine is a routine that has a JDEF, or IPL-V
definition. The problem of compiling a routine can be stated as
follows: given a routine without a JDEF (the present object or
definition), find the corresponding routine with a JDEF (the goal
object), where "corresponding" means that the compiled routine
has the same SDSC, or DSCN, as the given routine.
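
Stated this way, the simplest form of compiling is a lookup. The sketch
below assumes, purely for illustration, that each ROUTINE is a Python
dictionary keyed by the attribute names listed above; the helper names and
the sample library are hypothetical, and a real compiler would construct a
JDEF when no stored routine corresponds.

```python
def is_compiled(routine):
    """A routine counts as compiled when it carries a JDEF."""
    return "JDEF" in routine


def corresponds(given, candidate):
    """True when candidate has the same SDSC, or the same DSCN, as given."""
    return any(
        attribute in given and given[attribute] == candidate.get(attribute)
        for attribute in ("SDSC", "DSCN")
    )


def find_compiled(given, library):
    """Return the first compiled routine corresponding to given, or None."""
    for candidate in library:
        if is_compiled(candidate) and corresponds(given, candidate):
            return candidate
    return None


library = [
    {"IPLN": "R1", "DSCN": ["COPY", "LIST"], "JDEF": ["step-1", "step-2"]},
    {"IPLN": "R2", "DSCN": ["ERASE", "LIST"]},  # described but not compiled
]
goal = find_compiled({"DSCN": ["COPY", "LIST"]}, library)
```

Here goal is the R1 entry; a request matching only R2 finds nothing,
because R2 has no JDEF.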


C. COMIT


COMIT is a problem oriented language for symbol manipulation,
especially designed for dealing with strings of characters. Facilities
are available for easy rearrangement, insertion or deletion of
characters of strings, and for complex searches involving pattern
matching in terms of incomplete descriptions or in terms of context.
A facility for dictionary lookup and flexible input and output
facilities are provided.


COMIT was originally designed for mechanical translation and
linguistic research, to provide the linguist who had no previous
programming training with immediate access to a computer. It is
based on a Chomsky notation that is familiar to linguists, with a number
of additional features making it a useful and convenient
programming language. COMIT is not a language in which one
"programs in English"; it is a highly abbreviated notation.


The COMIT system, as implemented for the IBM 709 and 7090,
consists of about 16,000 instructions. It compiles the source
program into an internal language which then runs interpretively at
a high level. This program was distributed through SHARE in September
1961, and was programmed at the Massachusetts Institute of Technology by
a joint effort of the Research Laboratory of Electronics Mechanical
Translation Group and the Computation Center.


D. SAINT


Introduction 


The IBM 7090 was programmed to solve elementary symbolic
integration problems at approximately the level of a good college
freshman. "SAINT", an acronym for Symbolic Automatic INTegrator,
performs indefinite integration, and definite and multiple integration
when these are trivial extensions of indefinite integration. It uses
many of the methods and heuristics of students attacking the same
problems.


Pattern Recognition 


Pattern recognition plays an important role, for it consumes much
of the program and programming effort. It is used frequently and
with great variety in determinations involving standard forms,
algorithm-like and heuristic transformations, and relative cost
estimates. Finally, it consumes much of the time in solving
integration problems.


E. GIT


GIT, an acronym for Graph Isomorphism Tester, has been written
in the COMIT language described in Section C. The graph isomorphism
problem may be stated as follows: given two directed line graphs,
determine whether or not they are isomorphic, and if they are,
specify a transformation carrying the first graph into the second.


The problem of ascertaining whether a pair of directed line graphs
is isomorphic is one for which no efficient algorithmic solution is
known. Because a straightforward enumerative algorithm might
require 40 years of running time on a high speed computer to compare
two 15-node graphs, a more sophisticated approach is required. The
Graph Isomorphism Tester incorporates a variety of processes that
attempt to narrow down the search for an isomorphism, with no one
scheme relied upon exclusively for a solution. The program is
designed to avoid excessive computation along fruitless lines.
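
To make "narrowing down the search" concrete, the sketch below contrasts
with plain enumeration of all node orderings. It is not GIT's method, which
the text does not spell out; it is a generic backtracking search, written
here in Python, that uses one assumed pruning idea: a node may only be
matched with a node having the same out-degree and in-degree, and every
partial matching must preserve the edges seen so far. Nodes are assumed to
be numbered 0 to n-1.

```python
def degree_signature(n, edges):
    """(out-degree, in-degree) of each node of a directed graph."""
    out_degree = [0] * n
    in_degree = [0] * n
    for u, v in edges:
        out_degree[u] += 1
        in_degree[v] += 1
    return list(zip(out_degree, in_degree))


def find_isomorphism(n, edges_a, edges_b):
    """Return a list m with m[u] = image of node u, or None if none exists."""
    set_a, set_b = set(edges_a), set(edges_b)
    sig_a = degree_signature(n, set_a)
    sig_b = degree_signature(n, set_b)
    # Cheap global tests that reject many pairs before any search.
    if len(set_a) != len(set_b) or sorted(sig_a) != sorted(sig_b):
        return None

    mapping = {}
    used = set()

    def consistent(u, w):
        """Matching u with w must agree with all nodes matched so far."""
        for v, x in mapping.items():
            if ((u, v) in set_a) != ((w, x) in set_b):
                return False
            if ((v, u) in set_a) != ((x, w) in set_b):
                return False
        return ((u, u) in set_a) == ((w, w) in set_b)

    def extend(u):
        """Try to match nodes u, u+1, ... given the matches already made."""
        if u == n:
            return True
        for w in range(n):
            if w in used or sig_a[u] != sig_b[w]:
                continue  # pruned: degrees differ or w already taken
            if consistent(u, w):
                mapping[u] = w
                used.add(w)
                if extend(u + 1):
                    return True
                del mapping[u]
                used.discard(w)
        return False

    return [mapping[u] for u in range(n)] if extend(0) else None


cycle = [(0, 1), (1, 2), (2, 0)]
relabelled = [(0, 2), (2, 1), (1, 0)]
loop_and_pair = [(0, 1), (1, 0), (2, 2)]
```

The last graph has the same degree signature as the cycle, so the degree
test alone does not separate them; the edge-consistency check does.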


Another program of the same type has been written at the Harvard
University Computation Lab and addresses a more generalized version of
the same graph isomorphism problem. It is quite similar in approach
to GIT, but it has been implemented in the 7090 assembly language
rather than COMIT.


F. LINE BALANCING


Introduction 


A heuristic program for assembly line balancing has been developed.
The assembly line balancing problem, like many combinatorial
problems, has not been solved in a practical sense by advanced
mathematical techniques. The heuristic approach does not guarantee an
optimum solution; the ultimate measure of a heuristic program is
whether it provides better solutions more quickly and/or less
expensively than other methods.


Problem Definition 


Given a production rate (or equivalently, a cycle time), the minimum
number of work stations (or operators) consistent with the time and
ordering constraints of the product is to be determined. The
assembly line balancing problem concerns a set of elemental tasks
where each requires a known operation time per unit of product,
independent of when performed, and where a partial ordering exists.


An optimum solution of the problem consists of an assignment of
elemental tasks to work stations so that each task is assigned to one
and only one work station; the sum of the times of all tasks assigned
to any one station does not exceed some pre-set maximum or cycle
time; the generated stations can be ordered such that the partial
orderings among tasks are not violated; and the number of work
stations is minimized.
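
The first three conditions can be checked mechanically, and the fourth has
a simple lower bound. The Python sketch below is an illustration only, not
part of the heuristic program described here; the function names and the
four-task example are hypothetical. Stations are given in line order, and
precedence pairs (x, y) mean that task x must not come after task y.

```python
import math


def minimum_stations(task_times, cycle_time):
    """Lower bound on station count: total work divided by cycle time."""
    return math.ceil(sum(task_times.values()) / cycle_time)


def check_balance(task_times, precedence, stations, cycle_time):
    """Return a list of violated conditions; an empty list means feasible."""
    problems = []
    assigned = [task for station in stations for task in station]
    # Each task is assigned to one and only one work station.
    for task in task_times:
        count = assigned.count(task)
        if count != 1:
            problems.append(f"task {task} assigned {count} times")
    # No station's total time exceeds the cycle time.
    for index, station in enumerate(stations):
        load = sum(task_times[task] for task in station)
        if load > cycle_time:
            problems.append(f"station {index} load {load} exceeds {cycle_time}")
    # The station order respects the partial ordering among tasks.
    station_of = {task: i for i, station in enumerate(stations) for task in station}
    for before, after in precedence:
        if before in station_of and after in station_of:
            if station_of[before] > station_of[after]:
                problems.append(f"{before} placed after {after}")
    return problems


task_times = {"a": 3, "b": 2, "c": 4, "d": 1}
precedence = [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d")]
balance = [["a", "b"], ["c", "d"]]
```

With a cycle time of 5 the example balance is feasible and uses exactly
the lower bound of two stations, so it is also optimum.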


Some difficulties associated with line balancing are the determination,
before solution, of the minimum number of operators; the minimization of
the variation in work load among stations in evaluating possible
solutions; and the juxtaposition of a zoning constraint. (Zoning is the
division of the set of elemental tasks into overlapping subsets
corresponding to physical constraints on the assembly operation.)


Zoning of an assembly line may be determined by the position of the 
product on the conveyer, the layout of the production facility, or 
both. 


Heuristic Procedure


The heuristic procedure for assembly line balancing is:


a) Repeated simplification of the initial problem. This is
accomplished by grouping adjacent elemental tasks into
compound tasks.


b) Solution of the simpler problems created by assigning
tasks to work stations at the least complex level possible.
The compound tasks are broken into their elements only
when required for solution.


c) Smoothing the resulting balance. Tasks are transferred
among work stations until the distribution of assigned
time is relatively even.


Regrouping Procedures 


Five regrouping procedures were used in the heuristic line balancing
procedure: direct transfer, trading, sequential grouping, complete
grouping and exhaustive grouping. Direct transfer was limited to
cases involving two components. This method transferred elements
from one component to the other and by this means reduced the number
of men required by a straightforward totaling of whole men. Trading
also applied only to two components and assumed that direct transfer
had been attempted without success. Trading regrouped by shifting
an element larger than the acceptable limit from one component in
exchange for smaller elements (in a set relationship with the first
element shifted) from the other component. Sequential grouping is
used for several components, and its procedure is to construct
an acceptable work station from the front of the given group of
components.


The remaining two regrouping procedures, complete grouping and
exhaustive grouping, attempt to solve the remaining sub-problems
completely, and are "last ditch" methods at any particular level. Complete
grouping attempts to construct work stations until all task elements
are grouped from the front and (if required) the back of the component
group. If at any time the method cannot construct the station (i.e.,
the remaining elements total less than can be handled by the remaining
men) the method fails. Exhaustive grouping generates all possible
first work stations, then all possible work stations following for each
of the first stations. The comparatively large amount of effort
required to do an exhaustive grouping dictates that this procedure be
used only when two men are assigned.


Conclusion 


The heuristic line balancing procedure is not economically competitive
when measured against the dollar-per-hour cost of line balancing by
the industrial engineer, but a true evaluation of the method should
consider the possibility of averaging fewer men required along the
line; the value of quick production of balances at a large number of
production rates; and the value of releasing industrial engineers to
do other, more creative analytic work.


SYMBOL MANIPULATIVE LANGUAGES


A. ALGOL


ALGOL was designed by an international committee. It is a language
for use in scientific type problems and hence competes with FORTRAN.
It was designed to have all the features that a user might want, and
practical considerations of implementation were not given much
weight. This contrasts with FORTRAN, which was designed with the
IBM 704 in mind, and which had artificial restrictions so that efficient
object code could be produced. (ALGOL was designed with fast
compilation in mind, but with little emphasis on efficient object
programs.)


The first version of ALGOL was produced in 1958 and the second in 
1960. The 1960 version had the more extreme features, and it is this 
version that will be discussed. 


Although IBM is not producing any ALGOLs, ALGOL is of concern in
competitive situations. Burroughs, CDC, UNIVAC, and RCA are all
producing ALGOLs (none of these contain all the features of ALGOL).
ALGOL is very popular at universities, with Duke, Michigan, and
Princeton all having ALGOLs. The languages MAD, JOVIAL, NELIAC,
and XTRAN are all similar to ALGOL 58. The "intellectuals" support
ALGOL. Many articles concerning it appear in the literature. It is
especially popular in Europe, since the universities are more
influential there. McCracken is publishing an "Introduction to ALGOL"
which should increase its popularity here. Much of the published
literature is difficult to follow, but this probably will not be true of
McCracken's book.


One of the reasons that IBM is not implementing ALGOL is that
SHARE seems content with FORTRAN, which is being constantly improved.
(IBM did cooperate with SHARE in producing a large part of an ALGOL
compiler for the 709/7090.) Both IBM and its customers have a large
vested interest in FORTRAN because of existing compilers and library
programs. (Our competitors produce FORTRAN compilers also.) However,
pressure will probably continue because ALGOL is more powerful
than any existing or proposed FORTRAN.


In looking at ALGOL, two aspects should be borne in mind. The first 
concerns its more standard features, which in many cases are 
extremely desirable, and the second its rather extreme features, 
which tend to make it impractical. 


There are three forms of the ALGOL language. The reference
language is the one used in most publications, and the one we will
use. The hardware language is one that each manufacturer determines
is to be used on his equipment. The publication language allows things
like putting subscripts below the line and exponents above the line.
We will not discuss this.


ALGOL uses a number of special operators such as GOTO, IF, THEN,
DO. These are to be considered as though they were single characters.
A hardware implementation might be to use an "escape symbol" such
as a period, so that they would be keypunched as .GOTO. .IF. .THEN. .DO.
This paper will use underlines, although in the journals they are
indicated by boldface. Similar to these are symbols such as ≥, ≤, and >,
which might be keypunched as .GE. .LE. .GT. This is different
from the reserved word idea used in COBOL. With the scheme
mentioned, THEN, GE, etc. can be used freely as names in the
program. The only restrictions on names are for a group of eight
commonly used mathematical functions such as sin, sqrt, etc.


Operators are in general not ambiguous in ALGOL. For assignment
the := is used, reserving the symbol = for equality. For example,
A := B means to take the value of B and store it in A, while IF A=B has
the obvious meaning. F(X) means to execute the function F using X as
an argument. On the other hand, the Ith element of an array A is
indicated by A[I]. FORTRAN has ambiguities in it, and consequently
its compiler is slowed down.


Statements are punched in free form in ALGOL. There is no concern
with card columns, blanks, or continuation cards. A semicolon is
used to indicate the end of a statement. Also, statements may be
grouped to form compound statements by preceding them by BEGIN
and following them by END. An example might be BEGIN A := B+C;
X := Y; M := N-1 END. BEGIN can be thought of as an opening bracket
and END as a closing bracket.
Any number of statements can be included in a compound statement,
and a compound statement can be placed anywhere in the program that
a statement is called for. Hence compound statements can be nested
inside compound statements without any restriction as to depth of
nesting.


There is an IF THEN ELSE statement. It has the form: IF Boolean
expression THEN Statement ELSE Statement. If the Boolean expression
is true, the statement following the THEN is executed; if it is false,
the statement following the ELSE is executed. The program, in either
case, proceeds to the next statement. An example follows:


IF A=B THEN BEGIN X := R; S := T END ELSE Q := L;





The iterative type statement in ALGOL is the FOR statement.
Examples will illustrate it.


FOR I := 1, 5, 12, 13, 152 DO Statement. The statement following the
DO will be executed five times; the first time I will have the value 1,
the second time 5, the third time 12, etc.


FOR I := 1 STEP 2 UNTIL 50 DO Statement. The statement following
the DO will be executed repeatedly with I taking on the values
1, 3, 5, 7, ... 49. (This is quite similar to the FORTRAN DO.)


FOR I := I+1 WHILE A>0 DO Statement. This statement will be
executed repeatedly until the condition A>0 becomes false; each time
it is executed the value of I will be increased by 1. (For this example
to make sense, I should have been assigned a value before this
statement is reached in the execution of the program.) Finally, all
three of these types can be combined into one FOR statement. The
following example illustrates:


FOR I := 1, 4, 6 STEP 1 UNTIL 9, 15 WHILE A=B, 22 DO Statement.
The statement will be executed with I taking on the values 1, 4, 6, 7,
8, 9 and repeatedly for the value 15 until A=B becomes false, and
then finally for the value 22.


The following FOR statement is permitted: 


FOR I := 10 STEP -1 UNTIL 5 DO Statement. This statement will be
executed for I = 10, 9, 8, 7, 6, 5, in that order.


All of these examples can be generalized by replacing any of the
numbers used by any arithmetic expression. The index does not have
to be an integer quantity. As an example:


FOR I := M STEP J-K UNTIL N DO Statement


There are several comments to be made about this example. It is 
permissible for the statement following the DO to change the values 
of J and K so that the incrementing constant would have to be 
recomputed each time through the loop. It may happen that J-K may 
change sign as the statement is executed. In view of the previous 
examples, this makes for a rather involved situation for determining 
when the program is to leave the loop. This is carefully defined in 
ALGOL, but it makes a quite unusual loop. 
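
The loop rules discussed above can be imitated in Python to see why the
general case is awkward. The sketch below is an emulation under stated
assumptions, not ALGOL: variables live in a dictionary named env, and every
expression is passed as a zero-argument function so that it is recomputed
each time it is needed. The exit test for a STEP element follows the
ALGOL 60 definition, in which the loop ends when (I - limit) times the sign
of the step is positive. All function names are hypothetical.

```python
def sign(x):
    """Return -1, 0 or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def simple(env, var, expr):
    """A single arithmetic expression: assign once, execute once."""
    env[var] = expr()
    yield env[var]


def step_until(env, var, start, step, limit):
    """A `start STEP step UNTIL limit` element.

    step and limit are recomputed on every pass, so a body that changes
    J or K changes the increment and even its direction.
    """
    env[var] = start()
    while (env[var] - limit()) * sign(step()) <= 0:
        yield env[var]
        env[var] = env[var] + step()


def while_element(env, var, expr, cond):
    """An `expr WHILE cond` element: reassign, test, execute, repeat."""
    while True:
        env[var] = expr()
        if not cond():
            return
        yield env[var]


def run_for(elements, body):
    """Execute body once for each value produced by the list of elements."""
    for element in elements:
        for _ in element:
            body()


# FOR I := 1, 4, 6 STEP 1 UNTIL 9, 15 WHILE A=B, 22 DO Statement
env = {"I": 0, "A": 0, "B": 0}
trace = []


def body():
    trace.append(env["I"])
    if trace.count(15) == 3:  # after the third pass at 15, make A=B false
        env["A"] = 1


run_for([
    simple(env, "I", lambda: 1),
    simple(env, "I", lambda: 4),
    step_until(env, "I", lambda: 6, lambda: 1, lambda: 9),
    while_element(env, "I", lambda: 15, lambda: env["A"] == env["B"]),
    simple(env, "I", lambda: 22),
], body)
```

A step that evaluates to zero never satisfies the exit test, so such a loop
would run until the body changes the step or the limit.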


Variables are declared in ALGOL in a manner similar to that used in
FORTRAN IV. For example:


REAL X, Y, Z


INTEGER A, B, I, Q


This states that the names X, Y, and Z refer to quantities which can
take on as values any real number; these would normally be represented 
in floating point in the object program. The names A, B, I, and Q 
would take on only integer values. 


ALGOL also allows Boolean variables; these are variables that can 
take on only two values, true or false, and can then be used in an IF 
statement. An example: 


BOOLEAN D
D := A=B
IF D THEN ...


D will have the value true if A=B, otherwise it will have the value 
false. The IF statement will take the appropriate branch depending 
on the value of D. 


Arrays have few restrictions in ALGOL. Any number of subscripts 
are allowed, each subscript can be any integer expression, and 
subscripts can themselves be subscripted. An example: 


A[B+J, M[6,7]+T] will pick out an element from a two dimensional
array named A in the following manner. It will compute B+J for the
first subscript. For the second subscript it will find the appropriate
number from the M array and add the value of T to it.


Arrays are declared in the following manner: 


ARRAY A[-2:7, 4:8]. This states that the first subscript of the A
array may take on values -2, -1, 0, 1, 2, 3, 4, 5, 6, 7 and the second 
subscript may take on values 4, 5, 6, 7, and 8. Hence fifty locations 
will be reserved for this array. Note that negative and 0 values for 
the subscript are permitted, unlike FORTRAN. 
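
The count of fifty locations follows directly from the bounds. As a small
worked example in Python (the function names are hypothetical, and the
row-by-row storage order is an assumption, since the declaration itself
does not say how elements are laid out):

```python
def array_size(bounds):
    """Number of locations needed for an array with the given bound pairs."""
    size = 1
    for low, high in bounds:
        size *= high - low + 1
    return size


def offset(bounds, subscripts):
    """Position of an element, counting from 0, in row-by-row order."""
    position = 0
    for (low, high), subscript in zip(bounds, subscripts):
        if not low <= subscript <= high:
            raise IndexError(f"subscript {subscript} outside {low}:{high}")
        position = position * (high - low + 1) + (subscript - low)
    return position


bounds = [(-2, 7), (4, 8)]   # ARRAY A[-2:7, 4:8]
locations = array_size(bounds)  # 10 values times 5 values
```

Subtracting the lower bound from each subscript is what lets negative and
zero subscripts map onto ordinary storage positions.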


A switch in ALGOL is a collection of labels the programmer may 
wish to branch to. It is declared as illustrated in the following example: 


SWITCH X := A, B, C, D;








GO TO X[3];


The GO TO will transfer to C, which is the third label in the 
declaration. 


Certain problems arise when a program is segmented, and different
sections are coded by different programmers. The concept of blocks
is used by ALGOL to handle this. A block is any compound statement
which contains declarations. These must occur at the beginning of
the block. The following example will illustrate some of the ideas
used:


P: BEGIN REAL A, B;














Q: BEGIN REAL A, C; OWN REAL D;











END 





END 





We have a block within a block. The variable C has meaning here 
only inside the inner block. In addition, if the program leaves this 
block and later returns, the value of C will not have been preserved. 
The storage location that is used could have been used by other parts 
of the program. C is called a local variable in the inner block. The 
same statements apply to B except that they apply to the outer block, 
which, of course, includes the inner block. B is called global in the
inner block. There really are two different variables represented by
the name A in this example. The A in the outer block cannot be 
referred to in the inner block because A in the inner block refers to 
the A declared there. D is defined as an OWN variable. This means 
that its value will be preserved after leaving the inner block, and will 
have this value when the program later returns to the inner block. 
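
The behavior of A, B, C and D in the example can be modeled with a small
Python class. This is a sketch of the scoping rules only, under the
assumption that each block keeps a table of its local names and a link to
the enclosing block; the class and method names are hypothetical, and None
stands for "no defined value".

```python
class Block:
    """One ALGOL block: local variables, OWN variables, and a parent."""

    def __init__(self, local_names, own_names=(), parent=None):
        self.local_names = list(local_names)
        self.own = {name: None for name in own_names}  # survives exits
        self.parent = parent
        self.locals = None  # storage exists only while the block is active

    def enter(self):
        # Fresh storage on every entry: old local values are not preserved.
        self.locals = {name: None for name in self.local_names}

    def leave(self):
        self.locals = None

    def _table_for(self, name):
        """Find the innermost declaration of name, searching outward."""
        block = self
        while block is not None:
            if name in block.own:
                return block.own
            if block.locals is not None and name in block.locals:
                return block.locals
            block = block.parent
        raise NameError(name)

    def get(self, name):
        return self._table_for(name)[name]

    def set(self, name, value):
        self._table_for(name)[name] = value


# P: BEGIN REAL A, B; ... Q: BEGIN REAL A, C; OWN REAL D; ... END ... END
p = Block(["A", "B"])
q = Block(["A", "C"], own_names=["D"], parent=p)

p.enter()
p.set("A", 1.0)
p.set("B", 2.0)
q.enter()
q.set("A", 10.0)   # the inner A, not the A of the outer block
q.set("C", 3.0)
q.set("D", 4.0)
b_seen_inside = q.get("B")  # B is global in the inner block
q.leave()
q.enter()                   # the program later returns to the inner block
```

After the second entry, C has lost its value while D still holds 4.0, and
the outer A was never disturbed by the assignment to the inner A.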





Subroutines are called procedures in ALGOL. They are normally 
written in the program that uses them. The following example 
illustrates: 


PROCEDURE Q(X, Y, Z); X:=Y+Z; 








X := R; Q(A, B, C); Y := T;


The procedure declaration defines the procedure. Later in the program,
the procedure is called as indicated in the second statement on the last
line. This is executed as A := B+C;


One of the rather extreme features in ALGOL, which is very costly
to implement, is called the copy rule. Normally when a subroutine is
executed the values of the parameters are computed, and these are
furnished as input to the subroutine. The copy rule states that the
procedure should be executed as though the instructions to produce the
parameter were copied into the procedure. An example illustrates:


PROCEDURE P(K, M); FOR I := 1 STEP 1 UNTIL 10 DO K := K+M;


If this procedure is called by the statement P(J, I+2), the compiler
must produce instructions to execute the procedure as though the
statement following the DO read J := J+I+2.

If the copy rule is not desired, parameters may be named in
the declaration VALUE, which means the arguments will be computed
and the results of the computation will be furnished to the subroutine.
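
The difference between the copy rule and VALUE shows up as a different
answer. The Python sketch below is an emulation under several assumptions:
the procedure is read as FOR I := 1 STEP 1 UNTIL 10 DO K := K+M, the call
is read as P(J, I+2) (the printed example is hard to read at this point),
I and J are kept in a dictionary named env, and the argument expression is
wrapped in a function so that it can be recomputed on each use. The name
call_p is hypothetical.

```python
def call_p(env, m_by_value=False):
    """Emulate the call P(J, I+2).

    K always stands for J itself. M is either recomputed on every use
    (the copy rule) or computed once at the call (VALUE).
    """
    m = lambda: env["I"] + 2        # the argument expression, kept unevaluated
    if m_by_value:
        frozen = m()                # VALUE: evaluate once, at the call
        m = lambda: frozen
    for i in range(1, 11):          # FOR I := 1 STEP 1 UNTIL 10 DO
        env["I"] = i
        env["J"] = env["J"] + m()   # K := K+M
    return env["J"]


by_name = call_p({"I": 5, "J": 0})
by_value = call_p({"I": 5, "J": 0}, m_by_value=True)
```

Under the copy rule J grows by 3, 4, ... 12 in turn, giving 75; with VALUE
it grows by 7 each time, since I was 5 at the call, giving 70.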


Recursive procedures are permitted in ALGOL. This means that the
definition of a procedure may call the procedure that is being defined.
An example follows:


PROCEDURE X(A, B); A := B+X; X(R, S);


ALGOL allows dynamic storage allocation. The following example 
illustrates: 


BEGIN REAL A, B, C; INTEGER M, N;
J := 1; I := 1;
L: M := J+3; N := I#2;


BEGIN ARRAY A[1:M, 1:N];








END 


J := J+3; I := I+7; GO TO L;


Note that every time the inner block is entered, the program must
reserve a different amount of storage for the A array. This feature
is implemented on the compiler IBM produced for SHARE.




B. JOVIAL 


Introduction 


JOVIAL, Jules' Own Version of the International Algebraic Language,
is based upon ALGOL 58. It was designed by the System Development
Corporation to provide a flexible and readily understandable language
for programming large scale computer-based command control
systems. It is a procedure oriented language, and at the present time
there are compilers written for the IBM 7090, the IBM AN/FSQ-31
and the closely related AN/FSQ-32, the IBM AN/FSQ-7, the Philco
2000 and the CDC 1604. The 7090, 2000 and 1604 compilers have
been made available through the computer user groups SHARE, TUG, and
CO-OP. The JOVIAL compilers are written and maintained almost
entirely in JOVIAL. They consist of two main parts: first, a
Generator which transforms JOVIAL programs into Intermediate
Language; second, a Translator which further transforms them into
machine language. Translators ordinarily incorporate a complete
symbolic assembly phase. Compilers for the Q-7, the 2000 and the
1604 use essentially the same common Generator and Intermediate
Language; thus one Generator and three Translators were produced.
JOVIAL compilers range in size from 50 to 60 thousand machine
instructions, and about 5 man years of work are required to write a new
Translator and get a compiler running on a new machine.


Statements 


It is convenient to recognize three classes of statements in JOVIAL:
simple statements, which express primitive data processing actions;
complex statements, which incorporate simple or compound statements
within them; and compound statements, which group together whole
strings of statements, whether simple, complex or compound. The compound
statement is made up of a series of statements enclosed
between the BEGIN and END brackets.


The NAME statement is a statement label, the same as the label
in an SPS program. In write-ups of the JOVIAL language, IDENTIFIERS
and NAMES are synonymous. The JOVIAL NAME is an arbitrary
though usually mnemonic alphanumeric symbol, at least two characters
long, which may be punctuated for readability by the ' mark. NAMEs
serve to identify the elements of the JOVIAL program information
environment, that is, statements, switches, procedures, items,
array items, tables, string items and files. Except for statement
names defined by context, all JOVIAL names must be declared, either
explicitly in the program or implicitly in the Compool or in the
procedure library. (A Compool is a library of system environment
declarations and storage allocation parameters.) A NAME is needed
only when the statement is to be executed out of sequence.


The ASSIGNMENT statement assigns the value specified by a formula
to be the value designated by the variable. The two functions of this
statement are data manipulation and arithmetic. It takes the form
a = b $ where the = sign is called the assignment separator.


The EXCHANGE statement exchanges the value designated by a pair 
of variables. The effect of an EXCHANGE statement on either of the 
variables involved is as if each has been assigned the value designated 
by the other. Consequently, the rules of ASSIGNMENT pertain and 
both variables must be of the same type: numeric, literal, status or 
Boolean. The statement is of the form a = = b $ where = = is called 
the exchange separator. As with all other statements it must be 
followed by a dollar sign. 


There are two types of TRANSFER statements. The unconditional
statement is GOTO x $. There are no spaces between the two words,
and the translator will substitute an unconditional branch. The
conditional transfer statement is an IF, where the condition to be tested
appears after the IF. If the condition is true, the next
statement is executed. If the condition is false, the next statement,
either compound or simple, is bypassed. An example is:


IF ALFA EQ BETA $ 


BRCH. GOTO STEP2 $ 


STEP 1. ALFA=GAMMA+t+DELTA 
STEP2. DIFF=GAMMA-DELTA 
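
The bypass rule for IF can be traced with a short Python sketch. It is an illustration only: the function name and the returned dictionary are hypothetical, and STEP1 is assumed to read ALFA = GAMMA + DELTA, since the printed example is partly illegible at that point.

```python
def run_example(alfa, beta, gamma, delta):
    """Trace the IF example: a false condition bypasses the next statement."""
    goto_step2 = False
    # IF ALFA EQ BETA $   (when false, the statement that follows is bypassed)
    if alfa == beta:
        # BRCH. GOTO STEP2 $
        goto_step2 = True
    if not goto_step2:
        # STEP1. assumed to be ALFA = GAMMA + DELTA
        alfa = gamma + delta
    # STEP2. DIFF = GAMMA - DELTA
    diff = gamma - delta
    return {"ALFA": alfa, "DIFF": diff}
```

When ALFA equals BETA the GOTO is executed and STEP1 is skipped; otherwise the GOTO itself is bypassed and control falls into STEP1 and then STEP2.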


A LOOP statement is a complex statement consisting of a list of FOR 
clauses, which establish the LOOP counters, and a simple or compound 
statement, which forms the repeatedly executed body of the loop. 
The LOOP statement may contain from one to three FOR statements. 
The one factor FOR statement defines a loop counter and gives it a value. 
Example: FOR I = 23 $. The counter would be initialized by setting 
I to the value given, and after each repetition the programmer must 
set I to I + 1. The test for the maximum I would also be made by the 
programmer. The two factor FOR statement sets the initial value of I 
and specifies the increment value, for example FOR I = 0, 1 $. In this 
case, I will initially be 0 and will be incremented by 1 each time. The 
test for the final value of I would be included by the programmer. The 
three factor FOR statement specifies the initial value, the increment 
value and the test value, e.g., FOR I = 0, 1, 9 $. The initial value is 0, 
the increment value is +1 and the loop would be terminated when I 
exceeds 9. The test is automatically included by the processor. 
Decrementation as well as incrementation is possible. 
Example: FOR I = 9, -1, 0 $. 
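
The two and three factor forms can be sketched in Python as follows. The function names are hypothetical, and the sketch tests the counter before each pass; the text does not say whether the compiled loop tests before or after the first pass, so that ordering is an assumption.

```python
import itertools


def two_factor_for(initial, increment):
    """FOR I = initial, increment $ : no built-in test, the caller must stop it."""
    return itertools.count(initial, increment)


def three_factor_for(initial, increment, final):
    """FOR I = initial, increment, final $ : stops once I passes the test value."""
    if increment == 0:
        raise ValueError("increment must be non-zero")
    i = initial
    while True:
        # Passing the test value means exceeding it when counting up,
        # or dropping below it when counting down.
        passed = i > final if increment > 0 else i < final
        if passed:
            return
        yield i
        i += increment
```

With these definitions, three_factor_for(0, 1, 9) produces the ten values 0 through 9, and three_factor_for(9, -1, 0) produces the same values in descending order.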

The OPEN OUTPUT or OPEN INPUT statement puts the particular device 
being called into ready status. After this statement a number of 
output statements or input statements can be executed, reading out 
or bringing in data. The last statement to be executed in the sequence 
of OUTPUT and INPUT statements must be SHUT OUTPUT or SHUT 
INPUT. 


An ARITHMETIC statement will contain any combination of the normal 
arithmetic functions. Parentheses are used to determine the order of 
operation. JOVIAL and FORTRAN both use the same set of symbols to 
denote arithmetic operations. 


The CONTROL statement has four variations. The unconditional 
branch is a GOTO n, where n is a statement number. In JOVIAL 
the n would refer to a NAME statement. The computed GOTO would 
branch to some statement depending upon the value of the variable 
being tested. It takes the form GOTO (n1, n2, ..., nm), I. If I were 
1 the program would branch to n1; if I were 4, the program would 
branch to the fourth statement number in the parentheses. The IF 
statement tests the value of an expression or a variable and, depending 
upon the value, will branch to one of three places specified in the 
statement. The form of the statement is IF (a) n1, n2, n3 where n1 is 
the statement branched to if (a) is negative, n2 is branched to if (a) 
is equal to 0, and n3 is branched to if (a) is positive. 
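
The selection made by the computed GOTO and by the three-way IF can be sketched in Python. Both function names are hypothetical, and raising an error for an out-of-range I is an assumption of this sketch; the text does not say what the processor does in that case.

```python
def computed_goto(labels, i):
    """Return the statement number chosen by GOTO (n1, n2, ..., nm), I."""
    if not 1 <= i <= len(labels):
        raise IndexError("I must lie between 1 and the number of labels")
    return labels[i - 1]  # I = 1 selects n1, I = 4 selects the fourth entry


def arithmetic_if(a, n1, n2, n3):
    """Return the statement number chosen by IF (a) n1, n2, n3."""
    if a < 0:
        return n1  # negative
    if a == 0:
        return n2  # zero
    return n3  # positive
```

For example, computed_goto([10, 20, 30, 40], 4) selects statement 40, and arithmetic_if(-2.5, 10, 20, 30) selects statement 10.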


The DO statement is the one used in looping. The starting value, final 
value and increment value are given in the statement, similar to the 
three factor FOR statement. Decrementation, however, is not 
permitted. The format is DO n I = n1, n2, n3. n3 may be omitted if 
its value is 1. The statement number n after the DO is that of the last 
statement of the DO to be executed. The increment and test are 
automatically inserted by the processor after the last statement. 
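
Because the increment and test follow the last statement of the range, the body of a DO is executed at least once. A minimal Python sketch of that behavior, with a hypothetical function name and the assumption that the loop ends as soon as the incremented index exceeds the final value:

```python
def do_loop(n1, n2, n3=1):
    """Yield the index values of DO n I = n1, n2, n3 (increment defaults to 1)."""
    if n3 <= 0:
        raise ValueError("decrementation is not permitted in a DO")
    i = n1
    while True:
        yield i      # the range of the DO is executed first
        i += n3      # then the index is incremented
        if i > n2:   # and tested against the final value
            break
```

Here do_loop(1, 10) yields 1 through 10, while do_loop(5, 1) still yields the single value 5, since the test comes only after the first pass.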


The SPECIFICATION is a non-executable statement, for it just 
reserves space and does not generate any machine instructions. An 
example of this is the DIMENSION statement which sets aside core 
storage, where DIMENSION A (20) will reserve enough storage to 
contain 20 floating point data words. 


The I/O statements are named for what is actually to be 
accomplished. To read a card, READ is specified. The unit does not 
have to be placed in ready status first. 


Constants 


A fixed point number in FORTRAN is so designated because it is an 
integer (no decimal point may appear) and the variable name begins 
with one of the letters I-N. In JOVIAL there must be a space followed by 
an A after the item name. After the A the number which specifies the 
number of positions in the constant is given and then an S or U for 
Signed or Unsigned. An example would be SUM A7S. 


A floating point constant in FORTRAN is designated by the letter 
with which the constant begins. It may not begin with the letters 
I through N, for they are reserved for fixed point numbers unless, 
of course, a TYPE statement is used within the FORTRAN IV program. 
The number must contain a decimal point and may be in the E notation 
(.32E12). In JOVIAL a floating point number has an F after the item 
name - example SUM F. 


Dual constants appear in JOVIAL and the same value is set up in each 
half of the item. A "D" comes after the item name, then n, which specifies 
the sum of the positions in each constant, S or U for signed or 
unsigned, and + or -n for the number of fractional bits. Example: 

    SUM D6 S+2. 


A status constant is the mnemonic label which denotes one of the 
values of the status item. The form is V (GOOD), where GOOD 
refers to the status of sum item. 


The literal constant is either in Hollerith, in which case it is preceded 
by an H, or in standard transmission code, in which case it is preceded 
by a "T". Standard transmission code is the octal representation of the 
constant in the machine. As the examples show, the letter is itself 
preceded by the number of characters in the literal. 
Example: 44H(THIS IS A LITERAL CONSTANT 
IN HOLLERITH CODE) 56T(THIS IS A LITERAL CONSTANT IN 
STANDARD TRANSMISSION CODE). 
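
The leading number in each example is the character count of the literal, which can be checked with a small Python sketch. The function name is hypothetical, and the use of the letter T for the transmission-code form is an assumption where the printed text is illegible.

```python
def literal_constant(text, code="H"):
    """Format a literal as count, code letter, and the text in parentheses."""
    if code not in ("H", "T"):
        raise ValueError("code must be 'H' (Hollerith) or 'T' (transmission)")
    # The count includes every character of the literal, blanks included.
    return f"{len(text)}{code}({text})"
```

Applied to the two example literals, the sketch reproduces the counts 44 and 56 given in the text.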


Figure 35 
Sample of a JOVIAL Program 


Summing Ten Floating Point Numbers 


    START JOK 1. 
    TABLE NUMBR R 10 $ 
    BEGIN 
        ITEM BB F $ 
        BEGIN 3.0 6. 10.2 0.0 20.0 1.23 .08 0.32 12.0 
              1E-2 5.0 END 
    END 
    ITEM SUM F $ 
    STEP1. SUM = 0.0 $ "INITIALIZE SUM" 
    STEP2. FOR I = ALL (BB) $ 
           SUM = SUM + BB ($I$) $ "COMPUTE SUM" 
    TERM $ 


Explanation of JOVIAL Program 


The program must begin with a START card and end with a TERM $ 
card. TERM must have a $ termination separator following it. The 
program name, which is optional in the START statement, must be 
followed by a (.) which is a statement name separator. 


The table declaration encompasses NUMBR, which is the name of the 
table. R signifies that this is a rigid (fixed) length table ("V" would 
signify variable), and 10 is the actual length of the table. The table 
declaration must be followed by a $ which terminates the declaration. 


The composition of the table must be defined between BEGIN and END 
statements. 


The item declaration declares the item comprising each entry in the 
table. In this case the item is identified as BB F. This means that BB 
is a floating point item. Once again the $ terminates the declaration. 


The BEGIN and END statements define the parameter list which 
comprises the table and the first ten entries of the item BB. 


Statement name - STEP1. must be followed by a (.) which is the 
statement name separator. The statement sets SUM to 0.0. The 
"COMMENT" in quotation marks tells the processor to ignore the 
enclosed text but print it on the program listing. 


STEP2. I is the index. The loop will be executed ten times 
since the statement says FOR I = ALL (BB). Since the order of taking 
the sum is unimportant, ALL may be used rather than another type 
of FOR statement. Included with the example is an example of the 
same program written in FORTRAN. This can be used as a 
comparative tool in evaluating the language of JOVIAL. 


Figure 36 


Sample of FORTRAN Program 


C     PROGRAM TO SUM 10 FLOATING POINT NUMBERS 
      DIMENSION BB(10) 
      SUM = 0.0 
      DO 103 I=1, 10 
  103 SUM = SUM+BB(I) 
      PAUSE 
      END 


Additional Language Specifications 


The statements to this point have been kept on a fairly simple level. 
In order to evaluate all of the ramifications of JOVIAL, however, 
additional specifications of the language should be perused. 


Regarding the processing of clauses, strings of JOVIAL symbols 
(i.e., delimiters, identifiers and constants) form clauses such as 
item descriptions, which describe values; variables, which designate 
values; and formulas, which specify values. In general, symbols may 
be separated by comments or by an arbitrary number of blanks. 
However, no separation is needed when one or both of the signs so 
joined is a mark. 


Clauses are combined with certain delimiters to form declarations 
and statements which are classified as sentences of JOVIAL. 
Statements assert actions that the program is to perform (normally 
in the sequence in which they are listed) and declarations describe the 
information environment in which the actions are to occur. 


In JOVIAL, values other than those denoted by constants or used only 
as intermediate results or for controlling loops must be formally 
declared as items (simple items, array items, table items or 
string items) before they can be referenced. When not a part of a table 
declaration, an item declaration defines a simple item with a single 
value. A mode declaration starts a new normal mode for the implicit 
declaration of all subsequently referenced (and otherwise undefined) 
simple items. An array declaration describes the structure of a 
collection of similar item values and also provides a means of 
identifying this collection with the single item named. In this manner, 
arrays of any number of dimensions may be declared. 


Functional modifiers facilitate the manipulation both of larger data 
elements (entries and tables) and of smaller data elements (segments 
of machine language symbols representing item values). A brief 
description of each modifier will suffice. A vital parameter in table 
processing is the Number of ENTries. The functional modifier NENT 
allows this unsigned integral value to be designated for a rigid 
length table. Another parameter in table processing is the amount of 
storage allocated to a table entry and thus to the entire table. This 
unsigned integral value, which is constant throughout the execution of 
the program, is expressed as the Number of WorDS or registers per 
ENtry and is designated by the NWDSEN modifier. 
NWDSEN is needed in executive programs that do dynamic storage 
allocation. The ALL modifier creates a loop with an undefined 
direction of processing. The ENTRY modifier allows an entry to be 
treated as a single value represented by a single composite machine 
language symbol. 


The BIT and BYTE modifiers are worthy of mention because of System 
360. The BIT modifier allows any segment of the BIT string 
representing the value of any item to be designated as an unsigned 
integral variable. Similarly, the BYTE modifier allows any segment 
of the BYTE string representing the value of any literal item to be 
designated as a literal variable. The MANT and CHAR modifiers 
are involved with floating point machine language symbol representation, 
and by using them, either component of any floating point item can be 
designated as a fixed point variable. By means of the ODD modifier, the 
least significant bit of any loop counter or floating- or fixed-point 
item can be designated as a Boolean variable: true if it represents a 
magnitude of one and false if it represents a magnitude of zero. 
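
The effect of the BIT and ODD modifiers can be illustrated with integer operations in Python. This is a sketch under stated assumptions: the function names and parameters are hypothetical, and bits are assumed to be numbered from the most significant end of the item, starting at zero.

```python
def bit_field(value, first, count, width):
    """Return `count` bits of a `width`-bit item, starting at bit `first`,
    as an unsigned integer (bit 0 is assumed to be the leftmost bit)."""
    if first < 0 or count < 1 or first + count > width:
        raise ValueError("segment lies outside the item")
    shift = width - first - count          # distance from the right-hand end
    return (value >> shift) & ((1 << count) - 1)


def odd(value):
    """True when the least significant bit of the magnitude is one."""
    return (abs(value) & 1) == 1
```

For a six-bit item holding binary 101100, bit_field(0b101100, 0, 3, 6) gives binary 101, and odd(7) is true while odd(8) is false.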


Summary 


JOVIAL is a general purpose, procedure-oriented and largely computer 
independent programming language, derived from ALGOL 58 with the 
major extensions of: input-output notation, a more elaborate 
description capability, the ability to manipulate fixed-point numeric 
values, and the ability to manipulate symbolic and other non-numeric 
values including machine language symbol segments. 


Some thirty computer installations have received JOVIAL compilers 
through the users groups SHARE, TUG and CO-OP, and JOVIAL has been 
adopted by the Navy Command Systems Support Activity as the interim 
standard programming language for Navy Strategic Command Systems. 


C. SNOBOL 


Strings and String Names 


The basic data structure in SNOBOL is a string of symbols, where 
names (a string of numerals and/or letters with medial periods) are 
assigned to strings for reference. The string named LINE.1 may 
have the contents: TIGER, TIGER, BY MY BED. 


String Formation 


The string named LINE.1 with the above contents is formed by the rule: 
LINE.1 = "TIGER, TIGER, BY MY BED" where the quotation marks 
specify the literal contents. Any symbols except quotation marks can 
be placed within the quotation marks. 


Strings can also be formed by concatenation. The rule: 
LINE.1 = "TIGER, TIGER, " "BY MY BED" will produce the same result 
as the earlier example. 


Previously named strings can be used to form new ones. The 
rule EXAMPLE = LINE.1 forms a string, EXAMPLE, with the same 
contents as the string LINE.1. Literals and named strings can be 
used together in formation, as in: 


    LINE.1 = "TIGER, TIGER, BY MY BED" 
    LINE.2 = "IN YOUR SHADES OF BLACK AND RED" 
    LINE.3 = "NEVER WILL I SLUMBER THROUGH," 
    LINE.4 = "WHEN I TURN MY GAZE ON YOU." 


    TEXT = LINE.1 "/" LINE.2 "/" LINE.3 "/" LINE.4 

The last rule will form a composite string with slashes separating 
the lines in a conventional manner. 
The spaces between string names and literals serve as break 
characters for distinguishing the elements to be concatenated, with 
one space required for separation. 


In forming the string, the string itself may be used. After performing 
the two rules: 

    NUMBER = "6" 
    NUMBER = NUMBER NUMBER "0" 

the string, NUMBER, will contain the literal "660". 
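
For readers more familiar with a present-day notation, the formation rules above correspond roughly to string assignment and concatenation in Python. The variable names are hypothetical stand-ins for the SNOBOL string names, and the explicit + takes the place of the separating blank.

```python
# Formation from literals
line_1 = "TIGER, TIGER, BY MY BED"
line_2 = "IN YOUR SHADES OF BLACK AND RED"
line_3 = "NEVER WILL I SLUMBER THROUGH,"
line_4 = "WHEN I TURN MY GAZE ON YOU."

# Formation by concatenating named strings and literals
text = line_1 + "/" + line_2 + "/" + line_3 + "/" + line_4

# A string may take part in its own formation
number = "6"
number = number + number + "0"   # contents are now "660"
```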


Pattern Matching 

To determine whether the string LINE.1 contains the literal "TIGER", 
the rule would be: LINE.1 "TIGER". This is similar to a formation 
rule, but without the equal sign. LINE.1 is scanned from the left for 
an occurrence of the five characters "TIGER" in succession. A pattern 
matching rule may succeed or fail. If LINE.1 is formed as in the 
previous example, the scan would be successful, and the scanned string 
is not altered. The pattern may be specified by the concatenation of 
a number of literals and string names; TEXT LINE.1 "/" LINE.2 
specifies the scan of a string named TEXT for an occurrence of the 
contents of LINE.1, immediately followed by the literal "/" and 
in turn, immediately followed by the contents of the string LINE.2. 
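
A pattern built only from literals and named strings amounts to a search for one concatenated sub-string. A minimal Python sketch, with a hypothetical function name and Python variables standing in for the SNOBOL string names:

```python
def scan(subject, *elements):
    """Succeed (True) if the pattern elements, taken in immediate
    succession, occur somewhere in subject; subject is left unaltered."""
    pattern = "".join(elements)
    return subject.find(pattern) != -1


line_1 = "TIGER, TIGER, BY MY BED"
line_2 = "IN YOUR SHADES OF BLACK AND RED"
text = line_1 + "/" + line_2

found_tiger = scan(line_1, "TIGER")            # LINE.1 "TIGER"
found_pair = scan(text, line_1, "/", line_2)   # TEXT LINE.1 "/" LINE.2
```

Both scans succeed for the strings shown, while a scan of line_1 for a literal it does not contain fails.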


String Variables 


If it is required to know whether a string contains one sub-string 
followed by another, but with the second sub-string not necessarily 
immediately after the first, a string variable is introduced to permit 
this. The rule: LINE.1 "TIGER" *FILLER* "BED" is of this kind. 
The question is whether LINE.1 contains "TIGER" followed by "BED" 
with perhaps something between. The symbol *FILLER* represents 
a string variable which takes care of this "something". If LINE.1 is 
formed as we have previously shown, the scan would be successful. 


A string variable may be any string name bounded by asterisks. A 
by-product of matching a pattern containing a string variable is the 
formation of a new string that has the name furnished between the 
asterisks of the string variable. 
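
The behavior of a string variable can be approximated with a regular expression in Python. This is a sketch, not the SNOBOL scanning algorithm: the function name is hypothetical, and the choice of the leftmost, shortest filler is an assumption, since the text does not say which match is taken when several are possible.

```python
import re


def scan_with_variable(subject, before, after):
    """Match `before`, an arbitrary filler, then `after`.
    Returns (success, filler); filler is None when the scan fails."""
    pattern = re.escape(before) + "(.*?)" + re.escape(after)
    match = re.search(pattern, subject, flags=re.DOTALL)
    if match is None:
        return False, None
    # The captured group plays the part of the new string named FILLER.
    return True, match.group(1)


ok, filler = scan_with_variable("TIGER, TIGER, BY MY BED", "TIGER", "BED")
```

For the contents of LINE.1 the scan succeeds, and the string formed as a by-product holds everything between the first "TIGER" and "BED".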


Replacement 


In the string LINE.2 it is desired to replace "HUES" by "SHADES". 
This would be accomplished by: LINE.2 "HUES" = "SHADES". This 
scans LINE.2 for an occurrence of "HUES", and if the scan is successful, 
"HUES" is replaced by "SHADES". A LINE.2 containing "IN YOUR HUES 
OF BLACK AND RED" will then become "IN YOUR SHADES OF BLACK 
AND RED". 


If the scan fails, the string being scanned is not altered. Any string 
formed as a result of a successful pattern match of a string variable 
on the left side of the equal sign can be used in the replacement 
on the right side. Thus: LINE.1 "BY" *FILLER* "BED" = FILLER 
would result in the deletion of "BY" and "BED" from LINE.1. 
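
The two replacement rules above can be sketched in Python. The function names are hypothetical; each returns a success flag together with the resulting string, which is the original string when the scan fails. As before, taking the leftmost, shortest filler is an assumption.

```python
import re


def replace(subject, pattern, replacement):
    """Replace the first occurrence of a literal pattern, if any."""
    start = subject.find(pattern)
    if start == -1:
        return False, subject            # scan failed: string not altered
    end = start + len(pattern)
    return True, subject[:start] + replacement + subject[end:]


def keep_filler(subject, before, after):
    """Replace `before` filler `after` by the filler alone."""
    pattern = re.escape(before) + "(.*?)" + re.escape(after)
    match = re.search(pattern, subject, flags=re.DOTALL)
    if match is None:
        return False, subject
    return True, subject[:match.start()] + match.group(1) + subject[match.end():]
```

Applied to "IN YOUR HUES OF BLACK AND RED", replace substitutes "SHADES" for "HUES"; applied to the contents of LINE.1, keep_filler removes "BY" and "BED" and leaves the text that lay between them.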

Rules 


A rule may consist of four parts, separated by blanks, in the 
following order: 


1. A STRING to be manipulated, i.e. the STRING REFERENCE; 
2. A LEFT SIDE specifying a pattern; 
3. An EQUAL SIGN; 
4. A RIGHT SIDE specifying a replacement. 

The string reference is mandatory, while the rest of the rule parts 
may be absent, depending upon the particular rule. 

A "GO TO" consists of a slash followed by one or more of the 
following parts: 

1. An unconditional transfer, which has the form (BA). 
   Upon the completion of the statement the next statement 
   to be executed is the statement with the label BA. 
2. A conditional transfer on failure, which has the form F(BB). 
   If the statement fails, the statement with the label BB 
   is to be executed next. 
3. A conditional transfer on success, which has the form S(BC). 
   Transfer is made to BC on success. 
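
The choice of the next statement can be summarized in a short Python sketch. The function and parameter names are hypothetical, and two points are assumptions not stated in the text: that control passes to the following statement when no transfer applies, and that an unconditional transfer takes precedence when it is present.

```python
def next_statement(succeeded, fall_through,
                   unconditional=None, on_success=None, on_failure=None):
    """Return the label of the statement to be executed next."""
    if unconditional is not None:
        return unconditional             # form (BA)
    if succeeded and on_success is not None:
        return on_success                # form S(BC)
    if not succeeded and on_failure is not None:
        return on_failure                # form F(BB)
    return fall_through                  # assumed: continue in sequence
```

For instance, a rule carrying only S(BC) transfers to BC when it succeeds and continues in sequence when it fails.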


Arithmetic 


Simple arithmetic may be performed on strings whose contents are 
integers. The rule L = C + X would form the string named L containing 
the arithmetic sum of the contents of strings C and X. 
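
Since the operands are strings of numerals, the sum is itself stored as a string. A minimal Python sketch, in which a dictionary stands in for the named strings and the function name and sample contents are hypothetical:

```python
def add_strings(table, target, left, right):
    """Form table[target] as the decimal string of the integer sum."""
    table[target] = str(int(table[left]) + int(table[right]))
    return table[target]


strings = {"C": "17", "X": "25"}
result = add_strings(strings, "L", "C", "X")   # corresponds to L = C + X
```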


Indirectness 


Indirectness is accomplished in SNOBOL by writing a $ sign in front of 
the string name. If the string FACTOR contains the literal "SUM", 
writing $FACTOR is the same as writing SUM. 
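
Indirectness is a lookup through the contents of one string to reach another. In the Python sketch below a dictionary stands in for the named strings; the function name and the sample contents of SUM are hypothetical.

```python
def indirect(table, name):
    """Return the contents of the string whose name is stored in table[name]."""
    return table[table[name]]


strings = {"FACTOR": "SUM", "SUM": "42"}
value = indirect(strings, "FACTOR")   # corresponds to $FACTOR, i.e. SUM
```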


Input/Output 


Input and Output are accomplished by the use of the two commands, 
READ and PRINT, following the string reference SYS. 


Language Extension 


Additional features are planned for SNOBOL in the near 
future, including extended input/output facilities consistent with the 
string orientation. 


D. NELIAC 


Introduction 


The NELIAC compiler, sponsored primarily by the Navy Electronics 
Laboratory, was started in July 1958 and completed within the 
following six months. Three special characteristics of NELIAC 
should be recognized: NELIAC compilers are self-compilers; most 
NELIAC compilers have been kept relatively short and simple; most 
NELIAC compilers have compiling speeds of many thousands of 
computing words per minute. A programmer familiar with the 
language can read, understand and improve any given compiler, and 
can recompile a true version of a compiler quite cheaply, since some 
recompile in less than a minute. 


Operators 


The NELIAC language is based upon the use of 25 symbolic operators, 
including punctuation, arithmetic, and relational symbols. Meanings 
are ascribed to these operator symbols on the basis of the Current 
Operator-Operand-Next Operator combination. The use of symbolic 
operators reduces the number and complexity of rules which must 
be kept in mind and reduces the problems of documentation. The language 
includes the ability to handle bits within computer words, to treat 
input/output without direct format statements, to insert machine 
language and to address computer memory directly. 